REMARKS 

Claims 20-28 are pending in the application and have been examined. Claims 20-28 
stand rejected. Claim 27 has been amended to correct certain formalities not related to 
patentability. Claims 29-31 have been added in this response and contain no new matter. 
Applicants respectfully request reconsideration and allowance of Claims 20-31. 

The Rejection of Claim 27 Under 35 U.S.C. § 112, Second Paragraph 

Claim 27 stands rejected under 35 U.S.C. §112, second paragraph, for failing to 
particularly point out and distinctly claim the subject matter of the invention. Applicants have 
amended Claim 27 to conform with proper Markush terminology in accordance with the 
Examiner's suggestion. Applicants have also amended Claim 27 by omitting the phrase "(i.e., 
simple sequence repeats (SSR))" in order to clarify the subject matter of the claimed invention. 
Applicants respectfully request removal of this ground of rejection. 

The Rejection of Claims 20-28 Under 35 U.S.C. § 112, First Paragraph 

(Written Description) 

Claims 20-28 have been rejected under 35 U.S.C. § 112, first paragraph, as containing 
subject matter that lacks an adequate written description in the specification. The Examiner has 
taken the position that the claims contain subject matter which was not described in the 
specification in such a way as to reasonably convey to one skilled in the art that the inventors, at 
the time the application was filed, had possession of the claimed invention. According to the 
Examiner, the specification does not provide guidance for the isolation or characterization of 
DNA from any tree species other than Pinus taeda, or for the isolation and characterization of 
any other type of DNA marker other than SSRs from any tree species other than Pinus taeda. 
The Examiner further notes applicants' failure to provide a conserved nucleotide sequence which 
encompasses molecular markers from a multitude of unrelated tree species. Applicants 
respectfully disagree for the following reasons. 

As an initial matter, applicants wish to point out that the claimed invention is directed to 
a method of tree breeding using DNA analysis to determine pedigree, and is not claiming 
specific DNA sequences. As described in the specification, DNA analysis refers to any method 
of analysis that reveals genotype information. Specification, page 12, lines 15-16. Examples of 
types of DNA analysis methods (e.g., RFLP, AFLP, etc.) are provided which can be used to 
identify molecular markers useful in the practice of the invention. Specification, page 12, lines 
15-28. Molecular markers, revealed through DNA analysis, are DNA sequence polymorphisms 
that are used as landmarks to track pedigree. Specification, page 12, lines 4-28. Applicants 
submit that it is not practicable to disclose every sequence of every marker, and further, 
disclosure of specific DNA sequences is not required to comply with the written description 
requirement for the following reasons. 

As stated in the Written Description Requirement Guidelines, 1) the review of whether the 
disclosure satisfies the written description requirement for the claimed subject matter is 
conducted from the standpoint of one of skill in the art at the time the application was filed, 
2) there is an inverse correlation between the level of skill and specificity of disclosure, and 3) 
information which is well known in the art need not be described in detail in the specification. 
Fed Reg Vol. 66, No. 4, Jan 5, 2001 pp. 1099-1107, 1105. The written description requirement 
is met "if a skilled artisan would have understood the inventor to be in possession of the claimed 
invention at the time of filing, even if every nuance of the claims is not explicitly described." Id. 
at 1106. Patents and printed publications in the art should be relied upon to determine whether 
an art is mature and what the level of knowledge and skill is in the art. Id. 

Applicants submit Claims 20-28 are supported by an adequate written description 
because 1) the skill in the art of tree breeding was mature at the time the application was filed 
and 2) a skilled artisan would have understood based on the specification that the inventors were 
in possession of the claimed invention. 

The specification provides adequate guidance for the isolation of DNA from a multitude 
of tree species. A working example of DNA isolation is provided in the specification using the 
commercially available DNeasy 96 DNA extraction kit from Qiagen. Specification, page 22, 
line 13 to page 14, line 4. One skilled in the art would know that the DNeasy kit is useful to 
isolate DNA from a wide range of plant species. See, e.g., pages 10-15 of Qiagen's technical 
bulletin hereto attached as Attachment A. 

With respect to methods of DNA analysis for progeny pedigree determination, numerous 
examples of such methods are provided such as, for example, RFLP analysis, AFLP analysis, 
RAPD analysis, SSR analysis and so on. See e.g., page 16, lines 8-18, Table 2, and page 26, line 
18 to page 31, line 19. Applicants note that the methods of DNA analysis disclosed in the 
specification were routine, well known in the art, and widely applicable to a variety of tree 
species (see, e.g., Staub J., HortScience 31: 729-739, attached hereto as Attachment B). 

Applicants submit therefore, that one skilled in the art of plant breeding would recognize 
that the applicants had possession of methods of DNA isolation and characterization for a wide 
variety of tree species at the time the application was filed. Therefore, applicants submit that 
Claims 20-28 are supported by an adequate written description and respectfully request removal 
of this ground of rejection. 

The Rejection of Claims 20-28 Under 35 U.S.C. § 112, First Paragraph (Enablement) 

Claims 20-28 stand rejected under 35 U.S.C. Section 112, first paragraph, for lack of an 
enabling description in the specification. The Examiner alleges lack of enablement with respect 
to (1) the identification of molecular markers from a variety of tree species, (2) phenotypic 
determination in a multitude of tree species, (3) the use of polymix breeding coupled with 
pedigree analysis to select an elite breeding group, and (4) that the practice of the method would 
require undue experimentation. Applicants respectfully disagree with the Examiner's 
conclusions for the following reasons. 

The test of enablement is whether one reasonably skilled in the art could make or use the 
invention from the disclosure in a patent coupled with information known in the art at the time of 
filing without undue experimentation. Not everything necessary to practice the invention need 
be disclosed; what is well-known is best omitted. All that is necessary is that one skilled in the 
art be able to practice the claimed invention, given the level of knowledge and skill in the art. 
M.P.E.P. Section 2164.08. 

A. The Specification Provides Specific Guidance on the Identification of Molecular 
Markers From a Variety of Tree Species 

According to the Examiner, the specification does not provide any guidance for the 
isolation and incorporation of molecular markers into a polymix-mediated tree-breeding 
program. Applicants note, as described above, that molecular markers are already present in the 
genomic DNA of the trees in a breeding program; therefore there is no unpredictable step of 
incorporating these pre-existing molecular markers into a polymix breeding program. Applicants 
further submit that the specification provides sufficient guidance in view of the state of the art to 
enable the claimed invention. As stated by the court in Enzo Biochem, Inc. v. Calgene, Inc., "[i]t is 
well settled that patent applications are not required to disclose every species encompassed by 
their claims, even in an unpredictable art. However, there must be sufficient disclosure, either 
through illustrative examples or terminology, to teach those of ordinary skill how to make and 
use the invention as broadly as it is claimed." 188 F.3d at 1374, 52 U.S.P.Q.2d at 1138 (Fed. 
Cir. 1999). 

Applicants submit that the specification contains specific guidance on the identification 
of markers from a variety of tree species. Molecular methods useful in the practice of the present 
invention allow the determination or inference of an individual's genotype based upon analysis of 
that individual's chemical constituents. Specification, page 12, lines 4-14. The genotype 
information is then compared to all potential parent genotype information to infer the pedigree of 
the individual. As described above, the DNA analysis techniques disclosed in the specification 
were well known in the art at the time of filing (see, e.g., Staub et al., HortScience 31: 729-740 
(1996), attached hereto as Attachment B). 
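
By way of illustration only, the comparison of a progeny genotype against the genotypes of all potential parents can be sketched as a simple exclusion test. The sketch below assumes a haploid, uniparentally inherited marker system scored as allele sizes; the locus names, allele sizes, parent labels, and the function name are hypothetical and are not taken from the specification.

```python
from typing import Dict, List

# Hypothetical data layout: locus name -> allele size in base pairs.
Haplotype = Dict[str, int]


def compatible_fathers(progeny: Haplotype,
                       candidates: Dict[str, Haplotype]) -> List[str]:
    """Return the candidate pollen parents that are not excluded.

    A candidate is excluded if it differs from the progeny at any locus
    scored in both individuals. Candidates sharing no scored locus with
    the progeny are also left out, since nothing supports the match.
    """
    matches = []
    for name, haplotype in candidates.items():
        shared = [locus for locus in progeny if locus in haplotype]
        if shared and all(progeny[l] == haplotype[l] for l in shared):
            matches.append(name)
    return sorted(matches)


# Illustrative (made-up) haplotypes for three pollen parents in a polymix.
candidates = {
    "P1": {"locA": 101, "locB": 88},
    "P2": {"locA": 103, "locB": 88},
    "P3": {"locA": 101, "locB": 90},
}
progeny = {"locA": 103, "locB": 88}

print(compatible_fathers(progeny, candidates))
```

In this sketch a single surviving candidate corresponds to an unambiguous paternity assignment, while several surviving candidates would indicate that more polymorphic markers are needed to resolve the pedigree.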

The development and use of informative markers for pedigree analysis in a number of 
tree species is enabled by the specification in view of the state of the art at time of filing. The 
specification states that any method of molecular analysis can be used that reveals a sufficient 
number of genetic polymorphisms (variation in a base pair at a given site within members of the 
same species) to identify which parental plants are the parents of a particular progeny. 
Specification, page 14, lines 11-13. Further, an illustrative working example describing use of 
single nucleotide repeat microsatellites (SSRs) to track parentage in Loblolly pine is provided as a 
type of marker that is useful in the practice of the method of the invention. See Specification, 
Example 1, Example 3, Example 4 and Example 5. Specific guidance is provided in the form of 
primer sequences, including 7 primer pairs useful for analyzing chloroplast microsatellites and 3 
primer pairs useful for analyzing nuclear microsatellites. See Table 2. However, applicants note 
there is nothing inherently unique about Loblolly pine and the markers chosen to suggest this 
method is limited to this species. The methods for detecting SSR markers that are described in 
the specification are equally applicable to the detection of any polymorphic nucleic acid marker 
(i.e., nucleic acid amplification and assessment of distribution, inheritance, and variability of 
polymorphic markers). Specification, page 26, line 18 to page 31, line 19. Applicants wish to 
point out that the primer pairs for chloroplast provided in the specification (SEQ ID NO: 1-14), 
originally described by Vendramin et al., Molecular Ecology, 5: 595-598 (1996), are known to 
work across a wide range of species due to the high degree of sequence conservation of the 
chloroplast genome. As stated in the specification, all of the 20 microsatellite primer pairs 
(Table 2, Example 3) used to amplify simple sequence repeat regions in the chloroplast genome 
of Pinus thunbergii were found to also amplify similar size DNA fragments in P. taeda. 
Specification, page 27, lines 20-24. As further evidence of the broad applicability of the 
disclosed primers, a review by Anzidei et al. states that the same set of primers described by 
Vendramin "have been used with success in 110 different conifer species belonging to different 
taxonomic classifications, in particular to the Pinaceae, Cupressaceae and Taxodiaceae." 
Anzidei M. et al., In: European Union DGXII Biotechnology FW IV Research Programme 
Molecular Tools for Biodiversity, Gillet, E.M. ed. (1999), attached hereto as Attachment C. 

Moreover, applicants submit that numerous DNA markers in addition to single nucleotide 
repeat microsatellites were well known in the art at the time of filing for many tree species, 
including for example, chloroplast DNA markers in Douglas fir (see e.g., M.U. Stoehr et al., 
cited by the Examiner in this Office Action); amplified fragment length polymorphism (AFLP) 
markers in Populus spp. (see e.g., Cervera et al., Plant Growth Regulation, 20: 47-52 (1996), 
attached hereto as Attachment D); microsatellite markers in Magnolia obovata (see e.g., Isagi et 
al., Heredity 84: 143-151 (2000), attached hereto as Attachment E), AFLP markers in Persoonia 
mollis (Proteaceae) (see e.g., Krauss and Peakall, Aust. J. Bot, 46: 533-546 (1998), attached 
hereto as Attachment F), and RAPD markers in walnut (Juglans regia L.) (see e.g., F.P. Nicese et 
al., Euphytica 101: 199-206 (1998), attached hereto as Attachment G). 

Therefore, for the reasons described above, applicants submit that the enablement 
requirement of pending Claims 20-28 is met in view of the disclosure in the specification, 
coupled with the knowledge in the art at the time of filing. Applicants respectfully request 
withdrawal of this ground of rejection. 

B. The Specification Provides Specific Guidance on Phenotype Determination in a 
Multitude of Tree Species 

According to the Examiner, the specification does not provide adequate guidance on 
phenotype determination in a multitude of tree species. As an initial matter, applicants wish to 
distinguish marker-assisted selection from the method of the invention, which only uses 
molecular markers to determine the pedigree of selected progeny. Applicants submit that the 
specification provides sufficient guidance in view of the state of the art to enable the method of 
evaluating progeny trees using objective criteria to obtain a phenotypic score as claimed. The 
term "phenotype score" is described in the specification as the objective measurement of any 
phenotypic trait or characteristic that is desirable in a plant breeding program, such as, for 
example, disease resistance, growth rate, growth habit, chemical composition of any plant tissue, 
drought resistance, temperature hardiness, elevation adaptation, fecundity and breeding value. 
See e.g., Specification, page 10, lines 23-26. The term "objective criteria" is described as the 
measurement of any plant characteristic or phenotype with any detection or measurement device 
that provides statistically meaningful data regarding the characteristic or phenotype being 
measured. See e.g., Specification, page 10, lines 27-32. In addition, methods are provided for 
statistical analysis of breeding values and heritability determinations. See, e.g., Specification, 
page 13, lines 20-30. 

Further, a working example is provided describing the measurement of exemplary 
phenotypic traits including height growth, stem diameter growth, straightness, disease resistance, 
insect resistance, general health and deformities. Specification, page 21, lines 14-20. The 
growth data were analyzed using best linear unbiased prediction software called GAREML (Dr. 
Dudley Huber, University of Florida, Gainesville, Florida) that generated breeding values for 
growth rate for the maternal parent and for every individual progeny. Id. Therefore, applicants 
submit that the specification provides adequate guidance on phenotype determination in a 
multitude of tree species. 

C. The Specification Provides Specific Guidance on the Use of Polymix Breeding 
Coupled With Pedigree Analysis to Select an Elite Breeding Group 

The Examiner takes the view that the specification provides no guidance regarding the 
use of the claimed method to select elite genotypes, or the use of the method in any tree species 
other than Pinus taeda. Applicants disagree with the Examiner's conclusions and submit that the 
specification provides specific guidance for selecting elite trees from the progeny. For example, 
candidate plants are identified from the progeny plants based on having at least one phenotypic 
characteristic that is statistically better, based upon objective criteria, than other progeny plants. 
Specification, page 13, lines 31-35. The pedigree of the progeny plants is determined using 
DNA analysis and an elite breeding group is chosen based on high phenotypic scores and low 
levels of offspring relatedness. In some embodiments, elite plants are selected from the progeny 
plants based upon a characteristic selected from the group consisting of phenotype score, 
estimated breeding value, paternal breeding value, maternal breeding value and any combination 
thereof. See Specification at page 9, lines 27-30. For example, an elite plant that has a high 
phenotype score and has parents that are of high breeding value is particularly valuable as a 
breeding parent in the next generation. Knowledge of an elite plant's pedigree allows selection 
of the next generation of parental plants to maximize the genetic diversity of new breeding 
groups. See Specification page 16, lines 19-26. Therefore, applicants submit that the 
specification provides adequate guidance on the use of polymix breeding coupled with pedigree 
analysis to select an elite breeding group. 
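
By way of illustration only, the selection of an elite breeding group on the basis of high phenotypic scores and low relatedness can be sketched as a ranked selection with a cap on how often any one parent may be represented. The cap on shared parents is an assumed stand-in for "low levels of offspring relatedness"; the tree identifiers, scores, and function name are hypothetical, and the specification may contemplate other selection rules.

```python
from collections import Counter
from typing import List, NamedTuple


class Progeny(NamedTuple):
    tree_id: str
    mother: str   # known maternal parent
    father: str   # paternal parent inferred by DNA analysis
    score: float  # phenotype score (higher is better)


def select_elite(progeny: List[Progeny], group_size: int,
                 max_per_parent: int) -> List[Progeny]:
    """Pick the highest-scoring progeny while limiting shared parents."""
    chosen: List[Progeny] = []
    uses: Counter = Counter()
    for tree in sorted(progeny, key=lambda t: t.score, reverse=True):
        if len(chosen) == group_size:
            break
        parents = {tree.mother, tree.father}
        if all(uses[p] < max_per_parent for p in parents):
            chosen.append(tree)
            for p in parents:
                uses[p] += 1
    return chosen


# Illustrative (made-up) progeny test results.
trees = [
    Progeny("T1", "M1", "P1", 9.0),
    Progeny("T2", "M1", "P2", 8.5),
    Progeny("T3", "M2", "P1", 8.0),
    Progeny("T4", "M2", "P2", 7.5),
    Progeny("T5", "M3", "P3", 7.0),
]

elite = select_elite(trees, group_size=3, max_per_parent=1)
print([t.tree_id for t in elite])
```

Without the inferred paternal parent, the second and third ranked trees in this example could not be recognized as half-sibs of the top-ranked tree, which is the information the pedigree analysis step supplies.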

D. The Practice of the Method of the Invention Does Not Require Undue 
Experimentation 

The Examiner maintains the position that polymix-mediated breeding of trees for 
phenotypic change is unpredictable, citing Lambeth et al., Theor Appl Genet (2001) 103: 930- 
943. The Examiner also relies on In re Marzocchi and Horton, 169 U.S.P.Q. 367 (CCPA 1971) 
at page 370 for the proposition that applicants' mere assertions in the response or in their own 
publication are not deemed probative to refute the evidence provided by the Examiner in the 
form of scientific reasoning and published scientific literature. 

Applicants respectfully disagree with the Examiner's interpretation of the In re Marzocchi 
case. The cited case does not support the Examiner's assertion and instead states: "[a]s a matter 
of Patent Office practice, then, a specification disclosure which contains a teaching of the 
manner and process of making and using the invention in terms which correspond in scope to 
those used in describing and defining the subject matter sought to be patented must be taken as in 
compliance with the enabling requirement of the first paragraph of Section 112 unless there is 
reason to doubt the objective truth of the statements contained therein which must be relied on 
for enabling support." 169 USPQ at 369 (emphasis added). 

With respect to what constitutes undue experimentation, the following factors are 
relevant: the breadth of the claims, the nature of the invention, the state of the prior art, the 
relative skill of those in the art, the predictability of the art, the amount of guidance presented, 
the presence of working examples and the quantity of experimentation necessary. MPEP 
2164.01(a), citing In re Wands, 858 F.2d 731, 737, 8 U.S.P.Q.2d 1400, 1404 (Fed. Cir. 1988). 
Applicants submit the claimed method does not require undue experimentation for the following 
reasons. 

The Examiner relies on White (1996) in Proc. QFRI-IUFRO Conf Tree Improvement for 
Sustainable Tropical Forestry, for the proposition that the use of polymix breeding may be 
confounded by the unequal reproductive success of many parents' pollen, which would lead to 
incorrect measurements of general combining ability. The White reference at page 133, col. 1, 
second paragraph, refers to potential issues for open-pollination related to tropical species of 
trees compared to control pollination, and does not state that polymix breeding is unpredictable. 
Applicants submit that equal fertilization success of the pollen parents of the polymix is not 
essential to the success of the methods of the invention. Moreover, the claimed method actually 
serves to overcome any effect of unequal pollination through pedigree analysis, thereby allowing 
for more predictable breeding programs than polymix provides alone. See Specification, page 7, 
line 34 to page 8, line 12. 

The Examiner cites Lambeth et al., page 936 as teaching that the claimed process is 
unpredictable. Applicants disagree with the Examiner's conclusions for the following reasons. 
The cited reference describes 3 cases of a total of 45 in which observed genotypes were not 
consistent with expected genotypes. Applicants submit that these three cases do not render the 
method unpredictable. The claimed method recites determining the pedigree of a plurality of 
progeny trees and does not require pedigree assignment of every progeny. Further, as noted in 
the cited article, the small number of inconsistent cases could be attributed to mislabeled material 
or incorrect genotyping, and are likely not related to the method of DNA analysis. Lambeth et al., 
page 936. Applicants note that even if cases do arise with inconsistent genotypes, the method 
provides a selection process wherein they may be simply discarded as not fit for inclusion in the 
elite breeding group. Further, the method provides additional safeguards such as the routine use 
of markers in confirming parentage in progeny tests, or creating polymixes that avoid mixing 
pollens known to share the same paternal haplotype. Specification, page 19, line 30 to page 20, 
line 2. 

Therefore, applying the Wands factors to the instant application, it is apparent that a 
reasonable correlation exists between the scope asserted in the claimed subject matter and the 
scope of guidance the specification provides because the specification contains adequate, specific 
guidance on the identification of markers from a variety of tree species and use of the markers in 
a polymix breeding program. The specification contains a working example using the method of 
the invention to generate candidates for an elite group, and guidance is provided on the selection 
of an elite breeding group. Therefore, applicants respectfully submit that the specification 
provides an enabling disclosure for the claimed invention and request removal of this ground of 
rejection. 

The Rejection of Claims 20-28 Under 35 U.S.C. § 103(a) as Being Unpatentable Over 
Bridgwater in View of El-Kassaby and Further in View of Stoehr 

Claims 20-28 stand rejected under 35 U.S.C. § 103(a) as being unpatentable over 
Bridgwater (1992) in Handbook of Quantitative Forest Genetics, Kluwer Academic Pub., 
Dordrecht, The Netherlands, pages 69-95 in view of El-Kassaby and Ritland (1992) Theor. Appl. 
Genet. 83(6-7):752-8 and Stoehr et al. (1998) Can. J. For. Res. 28: 187-95. According to the 
Examiner, it would have been obvious to one of ordinary skill in the art to utilize the method of 
polymix tree breeding taught by Bridgwater, and to modify that method by utilizing the pedigree 
analysis step in the Douglas fir polymix breeding program taught by El-Kassaby and to further 
modify that method by utilizing the DNA marker taught by Stoehr et al., as suggested by each 
reference; given the recognition by those of ordinary skill in the art that each would have 
continued to function in its known and expected manner. 

Applicants submit that the Examiner has failed to establish a prima facie case of 
obviousness. Three requirements are listed in the M.P.E.P. Section 706.02(j) for establishing a 
prima facie case of obviousness. First, there must be some suggestion or motivation, either in 
the references themselves or in knowledge generally available to one of ordinary skill in the art, 
to modify the reference or to combine the referenced teachings. Second, there must be a 
reasonable expectation of success. Finally, the prior art references must teach or suggest all the 
claim limitations. 

For the reasons set forth in detail below, applicants respectfully submit that the burden of 
establishing a prima facie case of obviousness has not been met. First, there is no suggestion to 
combine or modify the references' teachings to arrive at the claimed invention. Second, because 
the cited references teach away from the claimed invention, there can be no reasonable 
expectation of success for their combined teachings. Applicants remind the Examiner that 
hindsight and the guidance of the applicants' disclosure cannot be used to reconstruct the claimed 
invention. The teaching or suggestion to make the claimed combination and the reasonable 
expectation of success must both be found in the prior art, not in applicants' disclosure. MPEP 
Section 2143, citing In re Vaeck, 947 F.2d 488, 20 U.S.P.Q.2d 1438 (Fed. Cir. 1991). The mere 
fact that the prior art could be so modified would not have made the modification obvious unless 
the prior art suggested the desirability of the modification. In re Gordon, 221 U.S.P.Q. 1125 
(Fed. Cir. 1984). Also, in making a prima facie case of obviousness, the teachings of a reference 
must be taken in its entirety. 

The Examiner cites Bridgwater as teaching the advantages of polymix-mediated breeding 
such as resistance to rust disease, general combining ability and gains in additive genetic 
variation; wherein one type of polymix scheme is complete nesting involving the use of all 
pollen parents as females; and wherein the scheme generally costs less than other breeding 
schemes such as diallel crossing. Applicants submit, however, that Bridgwater does not teach or 
suggest the use of pedigree analysis of progeny or DNA analysis to determine pedigree in a 
polymix breeding program, as required in the claimed invention. Moreover, applicants submit 
Bridgwater teaches away from using polymix in a wide variety of tree species due to lack of 
control of male pedigree based on inbreeding depression which would reduce expected genetic 
gains. See Bridgwater at page 75. 

The Examiner cites El-Kassaby as teaching the use of molecular markers such as 
isozymes to determine the pedigree of progeny from a polymix cross of Douglas fir trees. 
Applicants submit El-Kassaby actually teaches away from their invention, for the following 
reasons. El-Kassaby describes a study using a polymix of three pollen donors chosen based on 
multilocus allozyme genotypes giving unambiguous determination of paternity to study male 
reproductive success. The three males in the study showed wide variation in reproductive success, 
leading to the conclusion, in concurrence with Bridgwater, that a drawback of the polymix 
breeding method is lack of male pedigree control. The solution proposed by El-Kassaby teaches 
away from the invention by suggesting the use of a polycross with few or single males to 
determine general combining ability. The reference notes "due to the increased co-ancestry 
among offspring, using fewer males prevents concurrent testing and selecting." El-Kassaby at 
page 758. This teaching would not motivate one to use polymix for a breeding program with 
concurrent pedigree testing as claimed. Therefore, applicants submit that the El-Kassaby 
reference would not provide the required reasonable expectation of success for modifying the 
method of polymix breeding to include the step of concurrent pedigree analysis of progeny, nor 
does it teach the use of DNA analysis. 

The Examiner cites Stoehr et al. as teaching the use of DNA markers to identify pedigree 
in Douglas fir, wherein the technique has many advantages including increased accuracy and 
resolution over other markers such as isozymes. However, Stoehr et al. neither suggests nor 
provides any motivation for using a pedigree and a phenotype score to identify elite trees for use 
in a next generation of tree breeding, as required by the claimed invention. Rather, Stoehr et al. 
used a polymorphic genome marker to estimate the level of outside-orchard pollen 
contamination, supplemental mass pollination efficacies and natural selfing in Douglas fir. 

For the reasons noted above, the cited references fail to teach, suggest, provide any 
motivation to make or otherwise render obvious the claimed invention. Accordingly, applicants 
respectfully request withdrawal of this ground of rejection. 

New Claims 29-31 

New Claims 29-31 depend from Claim 20 and further define the tree breeding method of the 
invention. Claim 29 depends from Claim 27 and recites the limitation that the DNA analysis is 
performed using single nucleotide analysis. Support for Claim 29 can be found, for example, in 
the Specification at page 21, line 6 to page 31, line 9. Claim 30 depends from Claim 20 and 
recites the further limitation that the breeding group consists of conifer species. Support for 
Claim 30 can be found, for example, in the Specification at page 12, line 29 to page 13, line 2, 
page 32, lines 2-5 and page 27, lines 15-19. Claim 31 depends from Claim 30 and recites the 
additional limitation that the DNA analysis is performed using single nucleotide analysis. 
Support for Claim 31 can be found, for example, in the Specification at page 21, line 6 to page 
31, line 9. No new matter has been introduced. 

CONCLUSION 

In view of the above amendments and the foregoing remarks, applicants respectfully 
submit that all the pending claims are in condition for allowance. If any issues remain that may 
be expeditiously addressed in a telephone interview, the Examiner is encouraged to telephone 
applicant's attorney. 

Respectfully submitted, 

CHRISTENSEN O'CONNOR 
JOHNSON KINDNESS PLLC 

Barry F. McGurl 
Registration No. 43,340 
Direct Dial No. 206.695.1775 

I hereby certify that this correspondence is being deposited with the U.S. Postal Service in a sealed 
envelope as first class mail with postage thereon fully prepaid and addressed to Mail Stop RCE, Commissioner for 
Patents, P.O. Box 1450, Alexandria, VA 22313-1450, on the below date. 

Effect of DNA Quality on Spectrophotometry 

[Figure 1 plot: spectrophotometric scans; recovered labels: Oak, Dellaporta, DNeasy; 
x-axis: Wavelength (nm), marked at 260 and 280.] 

Figure 1. Spectrophotometric scans (220-320 nm) of 
DNA isolated from leaves/needles using the method 
of Dellaporta, CTAB, or the DNeasy Plant Mini Kit. 
Typically, pure DNA shows a symmetrical peak at 
260 nm and a smooth profile. Polysaccharides and 
other secondary metabolites, often copurified with 
plant DNA isolated using traditional methods, 
can interfere with spectrophotometric readings 
(A260/A280), leading to errors in determination of 
concentration and purity. 

Introduction 



Recent years have seen an explosion in the number and variety of 
plant molecular biology applications being used in research 
laboratories. The isolation of pure nucleic acids from plant materials 
presents special challenges, and commonly used molecular biology 
techniques often require adaptation before they can be used with 
plant samples. 

Several plant metabolites have chemical properties similar to those 
of nucleic acids, making contaminating metabolites difficult to remove 
from nucleic acid preparations. Co-purified metabolites and 
contaminants introduced by the purification procedure, such as salt 
or phenol, can cause inconsistent results in downstream applications. 
Listed below are some of the most common problems associated with 
contamination of nucleic acids prepared from plants. 

♦ Inhibition of enzymatic reactions (e.g., restriction digestion, 
reverse transcription, PCR amplification) 

♦ Inaccurate UV spectrophotometry quantitation (Figure 1) 

♦ Altered electrophoretic mobility (Figure 2) 

♦ Pipetting errors due to increased viscosity 

♦ Nucleic acid degradation during storage 

This guide gives an overview of the techniques used for plant nucleic 
acid purification and provides useful guidelines for successful results. 

Effect of DNA Quality on Electrophoretic Mobility

[Gel image; panel label: Low-Quality DNA.]

Figure 2. Agarose gel (0.8% TBE) analysis of genomic DNA isolated from plant leaves.
A: Low-quality DNA containing residual impurities, which hinder migration of DNA out
of wells and cause non-uniform electrophoretic mobility. B: High-quality DNA, which
shows uniform electrophoretic mobility. Left and right end lanes are markers.

Plant starting materials

Isolating intact, pure nucleic acid from plant cells presents special 
challenges for researchers who study plants. The methods used for 
growing, harvesting, storing, and preparing plant tissues can 
influence subsequent nucleic acid purification. 

Often, the experimental design dictates which methods are used to 
prepare plant materials. For example, gene expression analysis may 
involve specific growth conditions or plants of different ages. DNA 
may be required from herbarium or forensic samples. However, 
several parameters, including growth conditions, harvesting method, 
storage, and tissue disruption technique, can be adjusted when 
possible to make subsequent isolation easier. 

This chapter provides information and general guidelines which may 
be helpful to optimize conditions for successful nucleic acid 
purification. 



Effect of growth conditions on nucleic acid isolation

Growth conditions influence the production and accumulation of 
plant metabolites such as polysaccharides, polyphenolics, and 
flavones. The efficiency of many nucleic acid isolation techniques is 
affected by the presence of plant metabolites, and the presence of 
these compounds in RNA and DNA preparations can reduce 
performance in downstream applications (Figure 3). Many nucleic 
acid isolation methods recommend growing plants under conditions 
which do not induce high-level accumulation of plant metabolites. 

Metabolite production and accumulation may be affected by several 
factors: 

♦ Stress (e.g., induced by wounding, desiccation, pathogen
infection, or nutrient deficiency)

♦ Light intensity, spectrum, and duration

♦ Plant age

Because of the great variation between plants, it is very difficult to 
make general statements about the production and accumulation of 
plant metabolites. However, as a general guideline, it is recommended 
to use healthy, young tissues when possible. In addition, many 
protocols for "home-made" methods recommend growing plants in 
darkness for 1 to 2 days before harvesting to prevent high-level 
accumulation of plant metabolites. 



Effect of Carbohydrate Contamination
on PCR Performance

Figure 3. DNA was isolated from wheat leaves using the
DNeasy Plant Mini Kit. Reaction: Amplification reactions
were prepared using 40 ng DNA, 1 unit QIAGEN Taq
DNA Polymerase, the indicated concentrations of xylan,
and universal primers for the noncoding intergenic spacer
region between the chloroplast tRNA genes trnL (UAA)
5' exon and trnL (UAA) 3' exon of chloroplast DNA
(reference 3). Lysate: xylan was added to wheat lysates
before purification so that the xylan concentration in the
eluate would be 2.5% if it were not removed during the
DNeasy Plant isolation procedure. Amplification reactions
were prepared as above. M: markers.

Harvesting tissues

The content of nucleic acid and other plant metabolites can vary 
widely between plant species as well as between organs of the same 
plant or plants of different ages. Many tissue characteristics affect the 
ease, efficiency, and yield of nucleic acid purification. Examples of 
such properties include gene-expression levels, ploidy, vacuolization, 
and cell wall properties such as lignin content (for an example, see 
"Nucleic acid content of plant tissues", page 10). 

When possible, it is preferable to collect young material (e.g., 
expanding leaves or needles). Nucleic acid yields from young tissues 
are often higher than from old tissue, because young tissue generally 
contains more cells than the same amount of older tissue. In addition, 
young tissue of the same weight contains fewer metabolites which can 
affect the performance of downstream applications if not completely 
removed during nucleic acid purification (see Figure 3, page 5). 

When using fresh leaves for mini-preparation, tissue can be harvested
by cutting disks (e.g., with a hole puncher) and collecting the disks in
the lid of a microcentrifuge tube. A leaf disk with a 1.5 cm diameter
weighs 25-75 mg.

Storage of harvested tissues 

Tissue damage can result in degradation of nucleic acids. Since 
tissue can rarely be processed immediately after harvesting, storage 
conditions that preserve the integrity of the nucleic acids contained in 
the sample are essential. Improper storage is particularly damaging 
to RNA, although it can also influence DNA quality. 

When DNA is to be isolated, leaves and needles from most species 
can be stored for up to 24 hours at 4°C without affecting yield or 
quality. In general, samples that are to be stored for longer than 
24 hours should be frozen and kept at -80°C. However, some 
samples, for example, tree buds, can be stored for several days at 
4°C. Tissues stored at 4°C should be kept in a closed container to 
prevent dehydration. Large samples (e.g., branches) can be stored in 
a plastic bag containing a wet paper towel. 

For RNA isolation, plant material should be frozen in liquid nitrogen 
immediately after harvesting. Frozen samples can be stored at -80°C 
indefinitely for later processing. For convenience and efficient use of 
space, frozen tissue can be disrupted under liquid nitrogen (see 
"Disruption of plant materials", page 7) and the resulting powder 
stored at -80°C. 
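
The storage guidance above can be condensed into a small decision helper. The sketch below is only an illustration: the function name and the wording of the returned advice are assumptions, while the thresholds are the ones given in the text (up to 24 hours at 4°C for DNA from most leaves and needles, -80°C beyond that, and immediate freezing in liquid nitrogen for RNA). Exceptions mentioned in the text, such as tree buds, are not modeled.

```python
def storage_advice(target: str, hours_until_processing: float) -> str:
    """Return the storage condition suggested in the text (hypothetical helper)."""
    if hours_until_processing < 0:
        raise ValueError("hours_until_processing cannot be negative")
    target = target.upper()
    if target == "RNA":
        # RNA: freeze in liquid nitrogen right after harvesting, then keep at -80°C.
        return "freeze in liquid nitrogen, store at -80°C"
    if target == "DNA":
        if hours_until_processing <= 24:
            # Most leaves and needles tolerate up to 24 hours at 4°C.
            return "store at 4°C in a closed container"
        return "freeze and store at -80°C"
    raise ValueError("target must be 'DNA' or 'RNA'")


print(storage_advice("DNA", 12))
print(storage_advice("DNA", 72))
print(storage_advice("RNA", 1))
```
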
When it is not practical to store frozen samples for DNA preparation,
a number of methods are available for drying plant tissue using, for 
example, silica gel, food dehydrators, or lyophilizers (1). To prevent 
DNA degradation, material should be completely desiccated in less 
than 24 hours. Dried samples should be kept in darkness at room 
temperature under desiccating or hermetic conditions for long-term 
storage (2). Depending on how the material was handled, the DNA 
in herbarium and forensic samples may be degraded. When such 
samples are to be used in PCR, primers should be designed to amplify 
short (<500 bp) segments. 

Disrupted plant material can also be stored in DNeasy® lysis buffer
AP1 at room temperature for several months without appreciable
DNA degradation (Figure 4).

Disruption of plant material 

Complete disruption of cell walls, plasma membranes, and organelle 
membranes is essential to release all the nucleic acids contained in 
tissue. Insufficient disruption of starting material will lead to low yield. 
Cell wall properties vary widely between different species and 
different methods are required to achieve complete disruption. 

Disruption using mortar and pestle

The most common disruption method involves freezing samples in 
liquid nitrogen and grinding with a mortar and pestle (Table 1). 



Safe Sample Storage using Buffer AP1

[Gel image; lanes labeled Fresh and Stored 6 months.]

Figure 4. Genomic DNA was isolated from young wheat
leaves using the DNeasy Plant Mini Kit. After disruption,
lysis buffer AP1 was added to the tissue and the mixture
was incubated at 65°C for 10 min. DNA isolation was
carried out according to the protocol either immediately
after lysis (Fresh) or after 6 months storage at room
temperature (Stored 6 months). Eluates were run on a
0.8% TBE agarose gel. M: markers.



Table 1. Typical protocol for disrupting plant samples using mortar
and pestle

1. Freeze tissue in liquid nitrogen immediately after harvesting.
Do not allow the sample to thaw at any time during disruption.

2. Precool mortar to -20°C and keep on dry ice. 

3. Pour liquid nitrogen into the mortar, and precool pestle by 
placing the grinding end in the liquid nitrogen. 

4. Place frozen tissue in mortar and grind until a fine, whitish 
powder results. 

5. Add liquid nitrogen as necessary, being careful the sample 
does not spill out of the mortar. 

6. Using a precooled spatula, transfer the powder to pre-cooled 
containers of the appropriate size. To avoid thawing, large 
samples may be transferred to several containers. 

7. Ensure all liquid nitrogen has evaporated before closing the 
container. To prevent the sample from thawing after evaporation, 
the container should be cooled by placing it in dry ice or liquid 
nitrogen. 



Several modified protocols using liquid nitrogen and small
containers, such as microcentrifuge tubes, are available for disruption 
of small tissue samples. Most protocols are adapted for a specific 
tissue (e.g., young leaves) and include crushing tissue with a glass 
rod, plastic pestle, or wooden stick. These modified methods may 
result in DNA yields which are 20-80% of the yields obtained using 
standard disruption using a mortar and pestle. A side-by-side 
comparison of modified and standard methods is recommended 
before using a modified method for tissue disruption. Modified 
disruption methods are not recommended for RNA isolation, as 
variation in yield prevents accurate quantitative analysis. 

Tissue powder can be used directly for nucleic acid purification. 
After disruption and lysis, the lysate may be viscous and must be 
homogenized (see page 9). Homogenization is particularly important 
for RNA isolation. 

Disruption using a rotor-stator homogenizer

Rotor-stator homogenizers disrupt relatively soft plant tissues in the 
presence of lysis buffer. The rotor turns at a very high speed causing 
the sample to be disrupted and homogenized by a combination of 
turbulence and mechanical shearing. These homogenizers may not 
disrupt tough tissue, such as roots, and their use is not universally 
recommended. 

Rotor-stator homogenizers are available in different sizes and operate 
with differently sized probes. Probes with diameters of 5-7 mm are 
suitable for volumes up to 300 µl and can be used for homogenization
in microcentrifuge tubes. Probes with diameters of >10 mm require 
larger tubes. 

When disrupting tissue with a rotor-stator homogenizer, foaming
should be kept to a minimum by using vessels of the appropriate 
size and by keeping the tip of the homogenizer submerged. 

Disruption using a mixer mill

Mixer milling disrupts cells and tissues by rapid agitation in the 
presence of beads made of tungsten carbide, steel, or glass. 
Disruption is caused by the shearing and crushing action of the 
beads as they collide with the cells. When using fresh leaf tissue, 
most samples can be disrupted in the presence of lysis buffer. 
Alternatively, disruption of frozen plant material can be performed 
without lysis buffer if the beads and disruption vessel are precooled 
with liquid nitrogen. Samples should be disrupted in the presence of 
either lysis buffer or liquid nitrogen to preserve the quality of the 
contained nucleic acids. 
Optimal disruption parameters must be determined empirically for
each application. Disruption efficiency is influenced by the following
factors:

♦ Size and type of bead

♦ Speed and configuration of agitator

♦ Duration of disruption

♦ Amount of starting material

The Mixer Mill MM 300 is the first commercially available
high-throughput system designed for simultaneous, rapid, and effective
disruption of plant samples (Figure 5). This mixer mill allows up to
192 samples to be disrupted in just 2-4 minutes, and is ideal for
use with high-throughput purification formats such as the DNeasy 96
Plant Kit.
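
For planning purposes, the capacity and run time quoted above translate directly into a number of mill runs for a given sample count. The following is a minimal sketch under the assumption that only the disruption time itself is counted (loading, cooling, and handling are ignored); the function and constant names are hypothetical.

```python
import math

SAMPLES_PER_RUN = 192      # capacity stated in the text
MINUTES_PER_RUN = (2, 4)   # disruption time range stated in the text


def disruption_plan(n_samples: int) -> tuple[int, int, int]:
    """Return (runs, minimum minutes, maximum minutes) for n_samples."""
    if n_samples < 0:
        raise ValueError("n_samples cannot be negative")
    runs = math.ceil(n_samples / SAMPLES_PER_RUN)
    return runs, runs * MINUTES_PER_RUN[0], runs * MINUTES_PER_RUN[1]


runs, t_min, t_max = disruption_plan(500)
print(f"{runs} runs, {t_min}-{t_max} minutes of disruption time")
```
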

Recommendations for disruption and homogenization

In our hands, similar DNA yields were obtained using either a mortar 
and pestle or mixer mill. Disruption efficiency using other methods 
depends on the starting material (e.g., tissue type, plant age), and 
yields may be 20-80% of those obtained using a mortar and pestle 
or mixer mill. Incomplete disruption always results in reduced yields. 
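
The 20-80% figure can be turned into a quick estimate of what to expect before committing to an alternative disruption method. This is a rough sketch, assuming the reference yield was measured with a mortar and pestle or mixer mill on comparable tissue; the helper name is hypothetical and the percentages are simply those quoted above.

```python
# Relative yield range reported in the text for other disruption methods.
RELATIVE_YIELD_PERCENT = (20, 80)


def expected_yield_range(reference_yield_ug: float) -> tuple[float, float]:
    """Scale a reference yield (in µg) to the 20-80% range quoted in the text."""
    if reference_yield_ug < 0:
        raise ValueError("yield cannot be negative")
    low, high = RELATIVE_YIELD_PERCENT
    return reference_yield_ug * low / 100, reference_yield_ug * high / 100


low_ug, high_ug = expected_yield_range(30.0)
print(f"Expected yield: {low_ug:.1f}-{high_ug:.1f} µg")
```

A side-by-side comparison on the actual tissue remains the more reliable check, since the range is wide.
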

When preparing plant nucleic acids, samples disrupted by grinding 
in a mortar and pestle must be further homogenized to reduce 
viscosity caused by high-molecular-weight cellular components such 
as complex carbohydrates. 

The QIAshredder™ unit has been designed for efficient,
cross-contamination-free homogenization of cell and tissue lysates. Lysate
is loaded onto the QIAshredder spin column which is placed in a 
collection tube. After spinning in a centrifuge, the homogenized 
lysate is collected. Use of the QIAshredder unit also improves nucleic 
acid purification by removing cell debris and precipitates from 
cleared lysates, and is included in all DNeasy Plant and 
RNeasy® Plant kits. 



Efficient High-Throughput Disruption
of Plant Samples

Figure 5. The Mixer Mill MM 300 allows processing
of up to 192 plant samples in just 2-4 minutes.

DNeasy Plant Procedures
Mini, Maxi, 96

[Flowchart: Collect samples; Grind and lyse; Centrifuge through QIAshredder unit; Add binding buffer and bind DNA; Wash; Elute; Ready-to-use DNA.]

Figure 6. The DNeasy Mini, Maxi, and 96 Plant
procedures.



Nucleic acid content of plant tissues

The nucleic acid content can vary widely between different plant 
starting materials. For example, a tissue sample comprised of small 
cells will have a higher cell density, and therefore is likely to contain 
more nucleic acids than a sample of the same size which is comprised 
of larger cells. In addition, DNA contents depend on the haploid 
genome size and the ploidy of the sample. Arabidopsis has a small
diploid genome and correspondingly lower DNA yields than wheat,
which has a large hexaploid genome (see Table 2). RNA content
varies less predictably than DNA content. Highly proliferating tissues, 
such as meristems, typically contain more RNA than mature tissues. 
This variation in nucleic acid content should be considered when 
purifying nucleic acids. 

Table 2. Typical DNA yields from Arabidopsis and wheat

Plant        Genome size      Ploidy     Typical yield from 100 mg fresh tissue
Arabidopsis  1.9 x 10^8 bp    Diploid
Wheat        1.7 x 10^10 bp   Hexaploid  30 µg
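
The effect of genome size on how many genome copies a given mass of DNA contains can be estimated with a short calculation. The sketch below is an approximation under stated assumptions: an average mass of 650 g/mol per base pair (a commonly used rule of thumb, not a value from this guide), and the genome sizes taken as printed in Table 2. The function name is hypothetical.

```python
AVOGADRO = 6.022e23         # molecules per mole
BP_MASS_G_PER_MOL = 650.0   # assumed average mass of one base pair


def genome_copies(dna_ng: float, genome_size_bp: float) -> float:
    """Approximate number of genome copies contained in dna_ng nanograms of DNA."""
    if dna_ng < 0 or genome_size_bp <= 0:
        raise ValueError("mass must be non-negative and genome size positive")
    grams = dna_ng * 1e-9
    return grams * AVOGADRO / (genome_size_bp * BP_MASS_G_PER_MOL)


# Genome sizes as listed in Table 2.
arabidopsis = genome_copies(10, 1.9e8)
wheat = genome_copies(10, 1.7e10)
print(f"10 ng Arabidopsis DNA: about {arabidopsis:,.0f} genome copies")
print(f"10 ng wheat DNA: about {wheat:,.0f} genome copies")
```

With these inputs the same mass of template carries roughly 90 times more genome copies for Arabidopsis than for wheat, which is worth keeping in mind when a fixed nanogram amount is used as PCR template.
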



DNA isolation methods 

Plant molecular biology studies often require a simple, rapid, and 
reproducible method for preparing DNA from a wide variety of 
species. A number of factors that can affect the yield and purity of 
DNA must be considered, including incomplete cell lysis and 
carryover contamination of carbohydrates, polyphenolics, flavones,
and other metabolites. DNA isolation methods must be scalable for
different sample sizes and provide sufficient throughput to meet 
demanding project timelines. 

DNeasy Plant kits

With the DNeasy Plant procedure (Figure 6), plant cells or tissues are 
first mechanically disrupted and then lysed by the addition of lysis 
buffer and incubation at 65°C. During this step, RNase contained in 
the lysis buffer digests RNA in the lysate. After lysis, proteins and 
polysaccharides are removed by salt precipitation. Precipitates and 
cellular debris are removed in a single step by a brief spin through 
a QIAshredder unit (see page 9).
The cleared lysate is transferred to a new tube and a binding buffer
containing ethanol is added to promote binding of the DNA to the 
DNeasy membrane. The sample is then applied to a DNeasy spin 
column or a 96-well plate and spun briefly in a centrifuge. 
Contaminants, such as proteins and polysaccharides, are efficiently 
removed by two stringent wash steps. The highly specific binding
properties of the DNeasy membrane allow efficient purification and
eliminate the need for additional extraction or precipitation steps
that are often required for traditional isolation methods. Pure DNA
is eluted in a small volume of low-salt buffer or water. 

DNeasy purified DNA typically has A260/A280 ratios of 1.7-1.9,
and absorbance scans show a symmetric peak at 260 nm,
confirming high purity.
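
As an illustration of how such readings are typically evaluated, the sketch below converts an A260 reading into a concentration and computes the A260/A280 ratio. It assumes the widely used conversion of 50 µg/ml per A260 unit for double-stranded DNA at a 1 cm path length, which is a general convention rather than a value taken from this guide; the function names and the example readings are hypothetical.

```python
DSDNA_UG_PER_ML_PER_A260 = 50.0  # assumed conversion factor, 1 cm path length


def dna_concentration_ug_per_ml(a260: float, dilution_factor: float = 1.0) -> float:
    """Estimate dsDNA concentration of the undiluted sample from an A260 reading."""
    if a260 < 0 or dilution_factor <= 0:
        raise ValueError("a260 must be non-negative and dilution_factor positive")
    return a260 * DSDNA_UG_PER_ML_PER_A260 * dilution_factor


def purity_ratio(a260: float, a280: float) -> float:
    """Return the A260/A280 ratio used as a purity indicator."""
    if a280 <= 0:
        raise ValueError("a280 must be positive")
    return a260 / a280


# Hypothetical readings from a 1:10 dilution.
a260, a280 = 0.25, 0.135
print(f"Concentration: {dna_concentration_ug_per_ml(a260, 10):.0f} µg/ml")
print(f"A260/A280: {purity_ratio(a260, a280):.2f}")
```

As Figure 1 indicates, copurified polysaccharides and other metabolites can distort these readings, so a ratio in the expected range is best read together with the shape of the full scan.
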

The DNA-binding DNeasy membrane is available in either spin- 
column or 96-well formats. The combination of easy high-throughput 
disruption using the Mixer Mill MM 300 (see page 8) and reliable 
purification using the DNeasy 96 Plant Kit provides convenient DNA 
isolation from plant tissues in 96-well format (Figure 7). 

DNeasy Plant Kits have been used to isolate high-quality DNA from 
a wide range of plant species and tissues, including troublesome 
sources rich in polysaccharides and polyphenolics (Figure 8 and 
Table 3, next page). 



Reproducible DNA Yield and PCR
Performance

[Gel images; panel A: Reproducible yield; panel B: Reliable PCR. Lane labels: Sunflower (6.5 µg), Lupin, Rape, Wheat.]

Figure 7. Genomic DNA was purified from different plant
species using the DNeasy 96 Plant Kit following disruption
in liquid nitrogen. DNA was eluted in 2 x 100 µl Buffer AE.
A: Ten µl of each eluate was loaded per lane. Average
yields are given next to the plant names. B: Genomic DNA
purified from different plant species was amplified by PCR.
Five µl of each eluate was used as template, and 30 µl of
each PCR product was loaded in each lane. M: markers.

PCR Analysis of DNA from Different Plant Species

Figure 8. DNA (10 ng) from the indicated leaves or needles was amplified using
universal primers for the noncoding intergenic spacer between the tRNA genes trnL (UAA)
5' exon and trnL (UAA) 3' exon of cpDNA (reference 3). M: markers.

Table 3. Selection of plant species processed
with DNeasy Plant kits

Abies alba (silver fir)
Aesculus hippocastanum (horse chestnut)
Arabidopsis thaliana (thale cress)
Avena sp. (oat)
Brassica napus (oilseed rape)
Brassica oleracea (kohlrabi)
Cichorium endivia (chicory)
Citrullus lanatus (watermelon)
Egeria sp.
Fagus sylvatica (beech)¹
Helianthus spp. (sunflower)
Hordeum vulgare (barley)²
Humulus sp. (hops)
Hydrilla sp.
Kalanchoe spp.
Lupinus sp.
Lycopersicon esculentum (tomato)³
Myriophyllum sp.
Nicotiana tabacum (tobacco)
Oryza sativa (rice)⁴
Pelargonium sp. (geranium)⁴
Petunia sp.⁴
Pinus sylvestris (Scotch pine)
P. brutia⁵
Populus tremula (aspen)
Pseudotsuga menziesii (Douglas fir)
Quercus robur, Q. petraea (oak)⁶,⁷
Rhododendron sp.²,⁴
Rubus idaeus (raspberry)
Solanum tuberosum (potato)
Sphagnum palustre (moss)
Spinacia oleracea (spinach)
Taxus baccata (yew)
Triticum aestivum (wheat)⁴
Ulmus glabra (elm)⁶
Vitis spp. (grape)⁶
Zea mays (maize)

Young leaves or needles (and other tissues, as indicated) were
collected and immediately flash frozen. DNA isolation was then
performed with the DNeasy Plant Mini Kit. ¹Beechnut, ²dried
leaves, ³callus, ⁴leaves from adult plant, ⁵endosperm, ⁶old leaves,
rich in carbohydrates, ⁷buds. For more information on DNA
isolation from other species including fungi, call QIAGEN Technical
Services or your local distributor.



CTAB lysis 

This "home-made" DNA isolation method uses the detergent 
cetyltrimethylammonium bromide (CTAB) to lyse plant cells 
(4-6). After lysis, contaminants are removed by a chloroform 
extraction step. During extraction, it is essential that the correct salt 
concentration is used to ensure that contaminants are separated into 
the organic phase and DNA stays in the aqueous phase. DNA is 
recovered from the aqueous phase with a subsequent precipitation 
either by adding alcohol or lowering the salt concentration so that 
DNA forms insoluble complexes with the CTAB. DNA preparations 
isolated using this method may contain enzyme-inhibiting 
contaminants and therefore may not be sufficiently pure for sensitive 
downstream applications such as PCR (Figures 9 and 10). 

Effect of Contaminants on PCR Performance

[Gel image: reactions containing SDS, CTAB, or NaCl; lanes 1-9.]

Figure 9. DNA was isolated from oats (lanes 1-3), spinach (lanes 4-6), or kohlrabi
(lanes 7-9) using the DNeasy Plant Mini Kit. A 400 bp fragment was amplified from
40 ng purified DNA using universal primers for the noncoding intergenic spacer region
between the chloroplast tRNA genes trnL (UAA) 5' exon and trnL (UAA) 3' exon
of cpDNA (reference 3). Reactions were prepared with the indicated concentrations of
SDS, CTAB, or NaCl (contaminants typically found in DNA solutions prepared using the
Dellaporta or CTAB methods). M: markers.



Table 4. DNeasy Plant kit specifications

                                    DNeasy Plant Mini  DNeasy Plant Maxi  DNeasy Plant 96
Amount starting material (maximum)  100 mg wet         1 g wet            50 mg wet
                                    20 mg dry          200 mg dry         10 mg dry
DNA isolated                        Total DNA (genomic, chloroplast, mitochondrial)
Size of isolated DNA                Up to 40 kb, average 20-25 kb
DNA binding capacity*               50 µg              500 µg             50 µg
Typical DNA yield†                  3-30 µg            30-260 µg          2-12 µg
Elution volume (minimum)            50 µl              500 µl             50 µl
Processing time                     <1 h               <2 h               <2 h

* DNA content of most samples does not exceed the binding capacity of the DNeasy membrane.

† Nucleic acid content varies widely between different sources; see "Nucleic acid content of plant tissues", page 10.
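
The input limits in Table 4 lend themselves to a simple lookup when deciding which format can take a given sample. The sketch below is illustrative only: the dictionary copies the maximum starting amounts from the table, and the function name and structure are assumptions rather than part of any kit documentation.

```python
# Maximum starting material per format in mg, as listed in Table 4.
MAX_INPUT_MG = {
    "DNeasy Plant Mini": {"wet": 100, "dry": 20},
    "DNeasy Plant Maxi": {"wet": 1000, "dry": 200},
    "DNeasy Plant 96": {"wet": 50, "dry": 10},
}


def formats_for(sample_mg: float, state: str = "wet") -> list[str]:
    """List the formats whose maximum input is not exceeded by the sample."""
    if state not in ("wet", "dry"):
        raise ValueError("state must be 'wet' or 'dry'")
    if sample_mg <= 0:
        raise ValueError("sample_mg must be positive")
    return [name for name, limits in MAX_INPUT_MG.items()
            if sample_mg <= limits[state]]


print(formats_for(75, "wet"))
print(formats_for(8, "dry"))
print(formats_for(1500, "wet"))
```

An empty list signals that the sample would need to be split across several preparations.
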

Comparison of CTAB and DNeasy Methods: PCR performance

[Gel images; lanes labeled DNeasy and CTAB.]

Figure 10.

A: DNA was isolated from Arabidopsis
leaves using either CTAB lysis (CTAB)
or the DNeasy Plant Maxi Kit (DNeasy).
Amplification reactions were prepared
using purified DNA (1: 50 pg; 2: 100
pg) and primers to the Akin10 gene.
M: markers. (Data kindly provided by
Alain Lecharny, Institut de Biotechnologie
des Plantes, UMR CNRS-UPS Orsay,
France.)

B: DNA was purified from petunia using the
DNeasy 96 Plant Kit (DNeasy) and a
conventional CTAB-based purification method
(CTAB). The purified DNA was used as
template in a PCR to amplify a 600 bp
fragment of the glucanase gene. M: markers.
(Data kindly provided by M. D'Hauw and
T. Gerats, Department of Plant Genetics,
University of Gent, Gent, Belgium.)



Dellaporta (salting-out) method

This method (7) involves grinding plant tissue in an SDS-containing 
lysis buffer, filtering the lysate, and precipitating proteins and other 
compounds contained in the lysate with high salt concentrations. 
Removal of proteins and other contaminants using this salting-out 
method may be inefficient. RNase treatment and repeated alcohol 
precipitation are typically necessary before the DNA can be used in 
downstream applications. Difficult samples may require manual 
removal of precipitated DNA from the alcohol suspension to reduce 
coprecipitation of contaminants by centrifugation. However, this step 
may not sufficiently remove contaminants, and DNA preparations 
may contain enzyme-inhibiting compounds. Yields and purity using 
this method are often variable. 



ROSE method 

The ROSE method involves disruption and lysis of plant cells followed 
by incubation at high temperatures (90°C for 20 minutes) (8). The
lysate is then used directly in downstream applications. Considered 
a "quick-and-dirty" technique, this method may not be suitable for 
extremely sensitive applications because isolated DNA often contains 
enzyme-inhibiting contaminants (see Table 5, next page). 
Furthermore, high levels of contamination often result in DNA 
degradation during storage. Therefore, the ROSE method is 
appropriate for a limited range of applications. 
Table 5. Comparison of DNeasy and ROSE methods:
PCR using DNA isolated from gymnosperm species

Standard PCR      RAPD
DNeasy  ROSE      DNeasy  ROSE      Species (family)
0       3         3       0         Cycas circinalis L. (Cycadaceae)
3       3         3       0         Ginkgo biloba L. (Ginkgoaceae)
3       3         3       0         Gnetum gnemon L. (Gnetaceae)
3       2         3       0         Ephedra distachya ssp. helvetica L. (Ephedraceae)
3       2         3       0         Abies alba Mill. (Pinaceae)
3       3         3       0         Cedrus atlantica (Endl.) Manetti ex Carr. (Pinaceae)
3       2         3       0         Larix decidua Mill. (Pinaceae)
3       3         3       2         Picea abies (L.) Karst. (Pinaceae)
3       3         3       2         Pinus sylvestris L. (Pinaceae)
3       2         3       3         Pseudotsuga menziesii (Mirb.) Franco (Pinaceae)
3       3         3       0         Tsuga canadensis (L.) Carr. (Pinaceae)
3       1         3       0         Podocarpus lawrencei Hook.f. (Podocarpaceae)
3       0         3       0         Agathis brownii L.H. Bailey (Araucariaceae)
3       0         3       0         Araucaria angustifolia (Bertol.) Kuntze (Araucariaceae)
3       3         3       0         Sciadopitys verticillata (Thunb.) Schinz and Zucc. (Sciadopityaceae)
3       3         3       0         Taxus baccata L. (Taxaceae)
3       3         3       3         Torreya nucifera (L.) Schinz and Zucc. (Taxaceae)
3       3         3       3         Cephalotaxus harringtonia var. drupacea (Forbes) K. Koch (Cephalotaxaceae)
3       0         0       0         Cryptomeria japonica (L.f.) D. Don (Taxodiaceae)
3       0         3       0         Cunninghamia lanceolata (Lamb.) Hook. (Taxodiaceae)
3       1         3       0         Metasequoia glyptostroboides Hu and Cheng (Taxodiaceae)
3       1         3       0         Sequoia sempervirens (D. Don)
3       2         3       0         Sequoiadendron giganteum (Lindl.) Buchh. (Taxodiaceae)
3       0         3       0         Taxodium distichum (L.) A. Rich. (Taxodiaceae)
3       3         3       0         Callitris preissii Miq. (Cupressaceae)
3       3         3       0         Calocedrus decurrens (Torrey) Florin (Cupressaceae)
3       2         3       0         Cupressocyparis x leylandii Dallimore and A.B. Jackson (Cupressaceae)
3       3         3       0         Cupressus arizonica Greene (Cupressaceae)
3       0         3       0         Tetraclinis articulata (Vahl) Masters (Cupressaceae)
3       1         3       0         Thuja plicata Donn ex D. Don (Cupressaceae)
3       0         3       0         Thujopsis dolabrata Schinz and Zucc. (Cupressaceae)
3       0         3       0         Widdringtonia cedarbergensis J.A. Marsh (Cupressaceae)

DNA was extracted from 32 gymnosperm species using either the DNeasy Plant kit or the ROSE method. Isolated DNA was PCR amplified
in triplicate using either primers to an intronic region of the trnL (UAA) chloroplast gene or RAPD primers (reference 9). The table indicates
the number of successful reactions. (Data kindly provided by C. Sperisen, F. Gugerli, U. Büchler, and G. Mátyás, Biodiversity Department,
Swiss Federal Research Institute, WSL, Birmensdorf, Switzerland.)
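
Because each entry in Table 5 is a count out of three replicate reactions, the counts can be pooled into a success rate per method and assay. The sketch below does this for four rows transcribed from the table (Picea abies, Pinus sylvestris, Taxus baccata, and Thuja plicata); it is an illustration of the arithmetic only, not a reanalysis of the full data set, and the names used in the code are hypothetical.

```python
REPLICATES = 3  # each reaction was run in triplicate

COLUMNS = ("PCR/DNeasy", "PCR/ROSE", "RAPD/DNeasy", "RAPD/ROSE")

# Successful reactions per species, in the column order above (subset of Table 5).
ROWS = {
    "Picea abies": (3, 3, 3, 2),
    "Pinus sylvestris": (3, 3, 3, 2),
    "Taxus baccata": (3, 3, 3, 0),
    "Thuja plicata": (3, 1, 3, 0),
}


def success_rates(table: dict[str, tuple[int, ...]]) -> dict[str, float]:
    """Pool replicate counts into a fraction of successful reactions per column."""
    if not table:
        raise ValueError("table must contain at least one species")
    total = REPLICATES * len(table)
    return {
        column: sum(counts[i] for counts in table.values()) / total
        for i, column in enumerate(COLUMNS)
    }


for column, rate in success_rates(ROWS).items():
    print(f"{column}: {rate:.0%}")
```
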



Plant Nucleic Acid Purification



CsCl density gradient

Plant DNA can be isolated by centrifugation through a cesium
chloride (CsCl) density gradient (10). After plant cells are lysed with
detergent and treated with protease, the cleared lysate is precipitated
with isopropanol. The resuspended DNA is then mixed with CsCl
and ethidium bromide and centrifuged for several hours. Although 
this method allows the isolation of high-quality DNA, it is time 
consuming, labor intensive, and expensive, making it inappropriate 
for routine use. 

Comparison of plant DNA purification methods 

Table 6 summarizes several features of the DNA purification methods 
mentioned in the previous sections. 



Table 6. Comparison of plant DNA purification methods

                       DNeasy Plant kit  CTAB              Dellaporta        ROSE              CsCl gradient
Sample source          Plant cells       Plant cells       Fresh plant       Plant tissues     Plant tissues*
                       and tissues       and tissues       tissue            and cells
Can be used with       Yes               No†               No†               No†               Yes
a broad range of
plant species?
DNA quality            High              Medium            Low               Very low          High
Alcohol precipitation  No                Yes               Yes               No                Yes
required?
Preparation time       <1 h‡             2-4 h             2-4 h             1 h               12 h
for 24 samples                           (plus overnight   (plus overnight                     (for 8 samples)§
                                         resuspension)     resuspension)
Reproducibility        High              Variable          Variable          Poor              High**
Method convenient?     Yes               Moderately        No                Yes               No
DNA storage            Long-term         Long-term         Short-term        No                Long-term
Performance in         Excellent         Moderate          Poor              Very poor         Excellent
downstream
applications

* It is recommended that plants are grown in the dark for 2 days before isolating DNA.

† Protocol may need to be optimized for some species.

‡ Using the DNeasy Plant Mini Kit.

§ Does not include the recommended 2 day dark treatment.

** Depends on handling.



The genetic improvement of a species
through artificial selection depends on the 
ability to capitalize on genetic effects that can 
be distinguished from environmental effects. 
Phenotypic selection based on traits that are 
conditioned by additive allelic effects can pro- 
duce dramatic, economically important 
changes in breeding populations. Genetic 
markers — heritable entities that are associated 
with economically important traits — can be 
used by plant breeders as selection tools 
(Beckman and Soller, 1983; Darvasi and Soller,
1994). Marker-assisted selection (MAS) pro- 
vides a potential for increasing selection effi- 
ciency by allowing for earlier selection and 
reducing plant population size used during 
selection. Nevertheless, the phenotypic varia- 
tion that marker loci define is often nonaddi- 
tive, and is a function of genetic linkage, 
pleiotropy, and environment (Lark et al., 1995).
Thus, the efficiency of application of marker 
loci as predictors of phenotypic variation de- 
pends on many factors, and predictions of 
response to selection (R) or genetic gain (ΔG)
are often difficult. 

The predictive value of genetic markers 
used in MAS depends on their inherent repeat- 
ability (Weeden et al., 1992), map position, 
and linkage with economically important traits 
(quantitative or qualitative). The presence of a 
tight linkage (<10 cM [centimorgan]) between 
qualitative trait(s) and genetic marker(s)
may be useful in MAS to increase gain from 
selection (Kennard et al., 1994; Paran et al., 
1991; Timmerman et al., 1994). Likewise,
selection for multiple loci or quantitative trait 
loci (QTL) using genetic markers can be effec- 
tive if a significant association is found be- 
tween a quantitative trait and markers (Edwards 
and Page, 1994; Edwards et al., 1987; Lande 
and Thompson, 1990). 
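
The 10 cM threshold above is expressed in map units, whereas linkage is usually observed as a recombination fraction. As a hedged illustration, the sketch below converts between the two with Haldane's mapping function, which assumes no crossover interference; the choice of this function is an assumption made here for the example and is not specified in the text, and the function names are hypothetical.

```python
import math


def haldane_cm(r: float) -> float:
    """Map distance in cM for recombination fraction r (Haldane, no interference)."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


def haldane_r(d_cm: float) -> float:
    """Recombination fraction for a map distance in cM (inverse of haldane_cm)."""
    if d_cm < 0:
        raise ValueError("map distance cannot be negative")
    return 0.5 * (1.0 - math.exp(-2.0 * d_cm / 100.0))


def is_tightly_linked(r: float, threshold_cm: float = 10.0) -> bool:
    """Apply the <10 cM criterion from the text to a recombination fraction."""
    return haldane_cm(r) < threshold_cm


print(f"r = 0.05 corresponds to {haldane_cm(0.05):.2f} cM")
print(f"10 cM corresponds to r = {haldane_r(10.0):.4f}")
print(is_tightly_linked(0.05), is_tightly_linked(0.20))
```

For small recombination fractions the two scales nearly coincide, so the choice of mapping function matters mainly for loosely linked markers.
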

Often the biotechnological information 
presented in research reports is not tied di- 
rectly to classical genetic methodologies and 
the sophisticated technology presented results 
in a bewildering array of new terms. For scientists
who have a peripheral interest in genome
mapping, but would like to understand the 
potential role of MAS in plant improvement, 
the wealth of information currently being pro- 
duced in this area can lead to considerable 
confusion. The purpose of this paper is to 
describe available marker types and examine 
factors critical for their use in map construc- 
tion and MAS. This review clarifies how ge- 
netic markers are used in map construction 
and defines the potential use of genetic maps 
for MAS. 

MARKER TYPES 

Morphological. Morphological traits controlled
by a single locus can be used as genetic
markers if their expression is reproducible 
over a range of environments. Although 
codominant morphological markers have been 
useful as predictors of genetic response to 
selection, they can be influenced by environ- 
mental and genetic factors (e.g., epistasis). For 
instance, the expression of the determinate 
(de) character in cucumber (Cucumis sativus
L.) may vary, depending on growing environment
and modifying genes (Staub and
Crubaugh, 1995). Thus, a description of such 
a trait has significance only when accompa- 
nied by properly documented pedigree infor- 
mation and environmental conditions. The 
fact that such factors may modify a gene's 
expression of phenotype may limit its useful- 
ness as a genetic marker. A further drawback 
of morphological markers is that they may 
present an altered phenotype that interferes 
with grower needs. 

Isozymes. Isozymes are differently charged 
protein molecules that can be separated using 
electrophoretic procedures (usually starch gel) 
(Markert and Moller, 1959). Since enzymes 
catalyze specific biochemical reactions, it is 
possible to visualize the location of a particu- 
lar enzyme on a gel by supplying the appropri- 
ate substrate and cofactors, and involving the 
product of the enzymatic reaction in a color- 
producing reaction. The colored product be- 
comes deposited on the gel, forming a visible 
band where a particular enzyme has been 
electrophoretically localized. Bands visual- 
ized from specific enzymes represent protein 
products, have a genetic basis, and can provide 
genetic information as codominant markers. 
However, the paucity of isozyme loci and the 
fact that they are subject to post-translational 
modifications often restrict their utility (Staub
et al., 1982). 



RFLPs. Restriction fragment length poly- 
morphisms (RFLPs) are detected by the use of 
restriction enzymes that cut genomic DNA 
molecules at specific nucleotide sequences 
(restriction sites), thereby yielding variable-size
DNA fragments (Fig. 1). Identification of
genomic DNA fragments is made by Southern 
blotting, a procedure whereby DNA fragments, 
separated by electrophoresis, are transferred 
to nitrocellulose or nylon filter (Southern, 
1975). Filter-immobilized DNA is allowed to
hybridize to radioactively labeled probe DNA.
Probes are usually small [500 to 3000 base 
pairs (bp)], cloned DNA segments (e.g., ge- 
nomic or cDNA). The filter is placed against 
photographic film, where radioactive disinte- 
grations from the probe result in visible bands. 
Such bands are visualizations of RFLPs, which 
are codominant markers. 
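
How a change at a single restriction site alters fragment lengths can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two short sequences and the helper name `digest` are invented for the example, and EcoRI (recognition site GAATTC, cut after the G) is used simply because it is a familiar enzyme; the text above does not name a particular enzyme.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting seq at every occurrence of site.

    cut_offset is the position within the site where the strand is cut
    (1 for EcoRI, which cuts G^AATTC).
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]

# Hypothetical alleles: allele_b has lost the second site through one base change.
allele_a = "ATATGAATTCGGCCGGCCTTGAATTCAAAT"
allele_b = "ATATGAATTCGGCCGGCCTTGAGTTCAAAT"

fragments_a = digest(allele_a)  # three fragments
fragments_b = digest(allele_b)  # two fragments; the lost site merges two of them
```

The two alleles give different fragment-length patterns from the same stretch of DNA, which is the kind of difference a probe would reveal as bands of different size.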

The polymerase chain reaction (PCR) has 
been used to develop several DNA marker 
systems (Fig. 1). Three strategies primarily 
have been employed in the development of 
PCR-based marker systems. These include: 1) 
markers that are amplified using single prim- 
ers in PCR, where marker system diversity 
results from variation in the length and/or 
sequence of primers, and where anchor nucleotides
are present at 5' or 3' termini of primers
(e.g., RAPDs, SPARs, DAFs, AP-PCR, SSR- 
anchored PCR; see below); 2) markers that are 
selectively amplified with two primers in PCR 
such that their selectivity comes from the 
presence of two to four random bases at the 3'
ends of primers that anneal to the target DNA 
during the PCR (e.g., AFLP, below); and 3) 
markers amplified using two primers in PCR, 
which commonly requires cloning and/or se- 
quencing for the construction of specific prim- 
ers. In this case, variations in marker technol- 
ogy result from differences in the target DNA 
sequence present between two primers (e.g., 
AMP-FLPs, STRs, and SSRs). 

RAPDs. Of three similar, single-primer, 
PCR-based technologies, random amplified 
polymorphic DNA (RAPD; Williams et al., 
1990), DNA amplification fingerprinting 
(DAF; Caetano-Anolles et al., 1991), and
arbitrary-primed PCR (AP-PCR; Owen and
Uyeda, 1991; Welsh and McClelland, 1990), 
RAPDs have been used most widely for map 
construction and linkage analysis (Reiter et 
al., 1992) (Fig. 1). RAPD markers are gener- 
ated by PCR amplification of random ge- 
nomic DNA segments with single primers 
[usually 10 nucleotides (nt) long] of arbitrary 
sequence (Williams et al., 1990). The primer/ 
target complexes are used as substrates for
DNA polymerase to copy the genomic sequences
3' to the primers. Iteration of this
process yields a discrete set of amplified DNA
products that represent target sequences flanked
by opposite-oriented primer annealing sites. 
Amplification products can be separated by 
electrophoresis on agarose or polyacrylamide 
gels and visualized by staining with ethidium 
bromide or silver. RAPDs are usually domi- 
nant markers with polymorphisms between 
individuals defined as the presence or absence 
of a particular RAPD band (Fig. 2). The fur- 
ther development of RAPD methodology has 
produced other PCR-based markers (e.g.,
SCAR and ASAP markers). 

SCARs. Utility of a desired RAPD marker 
can be increased by sequencing its termini and 
designing longer primers (e.g., 24 nt) for specific
amplification of markers (Paran and
Michelmore, 1993). Such sequence characterized
amplified regions (SCARs) are similar
to sequence-tagged sites (STS) (Olson et al., 
1989) in construction and application. DNA 
sequence differences are manifest by the presence
or absence of a single unique band. SCARs
are more reproducible than RAPDs and can be
developed into plus/minus arrays where elec- 
trophoresis is not needed. Although SCARs 
are usually dominant markers, some SCARs 
can theoretically be converted to codominant 
markers by digestion with 4-bp restriction 
endonucleases, and identification of polymor- 
phisms by either denaturing gradient gel elec- 
trophoresis (DGGE) or single-strand confor- 
mational polymorphism (SSCP) techniques 
(Rafalski and Tingey, 1993). 

ASAPs. A recent modification of PCR tech- 
nology involves the alkaline extraction of DNA 
with subsequent amplification of the DNA 
template in microtiter plates using allele-specific
associated primers (ASAPs) that generate
only a single DNA fragment at stringent
annealing temperatures (similar to SCARs)
(Gu et al., 1995). The DNA fragment is present
in only those individuals possessing the appropriate
allele and thus eliminates the need to
separate amplified DNA fragments by electrophoresis
(i.e., presence/absence polymorphism).
This method relies on ethidium bromide,
whose fluorescence is dramatically enhanced
when it binds to the DNA double helix but which
does not bind to free nucleotides in the PCR
mixture. This approach was developed to decrease
time for DNA extraction and increase the 
reliability of the PCR reaction for large-scale 
screening. 

SPARs. The single primer amplification 
reaction (SPAR) is a DNA marker system that 
can produce multiple markers per assay (Fig.
1). The system uses primers based on 
microsatellites or simple sequence repeats 
(SSRs) and amplifies inter-SSR DNA se- 
quences (Gupta et al., 1994). Of the di-, tri-, 
tetra-, and pentanucleotide SSRs, the 
tetranucleotide repeats are most effective in 
producing polymorphic multiband patterns. 
The level of polymorphism is related to the 
genomic diversity within a given species. Most
DNA markers map to scattered genomic loca- 
tions. Although most SSR-SPARs are domi- 
nant markers, codominant markers can also be 
detected (Fig. 2). Given that a large
number of primers can be synthesized from
the tetranucleotide repeats (4^4 = 256), and
from the combination of di-, tri-, and
tetranucleotide SSRs, or compound SSRs, the
SSR-SPAR marker system may have broad 
application across a range of plant species. 
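
The count of 256 tetranucleotide motifs can be reproduced by simple enumeration. The sketch below is illustrative only: the helper `spar_primer` and the choice of four repeats per primer are assumptions made for the example, not details taken from the text. Note that the raw count of 256 includes homopolymers (e.g., AAAA) and motifs that are cyclic shifts of one another, so the number of truly distinct repeat families is smaller.

```python
from itertools import product

BASES = "ACGT"
# Every ordered 4-base motif: 4**4 possibilities.
motifs = ["".join(p) for p in product(BASES, repeat=4)]

def spar_primer(motif, repeats=4):
    """Hypothetical helper: build a single primer by repeating the motif."""
    return motif * repeats

primer = spar_primer("GACA")
```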

SSR-anchored PCR. This system employs 
single primers of dinucleotide simple sequence 
repeats (SSRs; see below), especially (CA)n
repeats, for amplification of markers. The
primer is anchored at either the 3' or 5' terminus
with two to four nucleotides (Zietkiewicz et 
al., 1994). Multiple bands containing inter- 
SSR regions are amplified and then are frac- 
tionated on polyacrylamide gels for pattern 
visualization. These amplified bands are mostly 
dominant markers and can be used in a wide 
range of plant species. 

AFLPs. Production of amplified fragment 
length polymorphisms (AFLPs) is based on 
selective amplification of restriction enzyme- 
digested DNA fragments (Zabeau and Vos, 
1993) (Fig. 1). Multiple bands are generated in
each amplification reaction that contains DNA 
markers of random origin. Analysis of DNA 
on denaturing polyacrylamide gels typically 
results in the production of 50 to 100 bands per 
individual sample. AFLPs are quantitative in 
that heterozygous and homozygous genotypes 
can be differentiated by the intensity of the 
amplified bands. The ability of this technology 
to generate many markers with minimum 
primer testing, and the system's high resolu- 
tion (i.e., band clarity and relatively low lane 
background) are features that make AFLPs 
attractive as genetic markers (primarily dominant;
Fig. 2). Because of its expense, automation
may be required to realize this technology's
full potential during MAS. 

[Fig. 2 panel labels. RAPDs, SPARs, etc.: AA AA AA aa (segregation ratio 3:1); AA AA AA AA (no apparent segregation). AFLPs: AA Aa Aa aa (segregation ratio 1:2:1); AA Aa Aa AA (segregation ratio 1:1).]

AMP-FLPs, STRs, and SSRs. Mini- and
microsatellite DNA sequences are an excellent
source of polymorphisms in eukaryotic
genomes, and are well suited for genotyping 
and map construction. Marker systems based 
on such sequences include amplified fragment 
length polymorphisms (AMP-FLPs; 
minisatellites in vertebrates; Fregeau and
Fourney, 1993), short tandem repeats (STRs;
microsatellites in vertebrates; Fregeau and
Fourney, 1993), and simple sequence repeats 
(SSRs; microsatellites in plants; Rafalski and 
Tingey, 1993) (Fig. 1). Mini- and 
microsatellites are comprised of tandem ar- 
rays of 15- to 70-bp and 2- to 5-bp monomeric 
repeat units, respectively. Polymorphisms 
appear because of variation in the number of 
tandem repeats in a given repeat motif. Most 
STRs and SSRs are dinucleotide repeat-based 
[(AC)n, (AG)n, and (AT)n] microsatellite 
markers (Rafalski and Tingey, 1993). Such 
polymorphisms are amplified by designing 
primers from the sequenced regions flanking 
the repeat motifs (Fig. 1). Similar to (CA)n 
repeats in humans, (AT)n dinucleotide 
microsatellite repeats are relatively abundant 
and highly polymorphic in plants (Akkaya et 
al., 1992; Morgante et al., 1994). This group of
markers is codominant in its expression (Fig. 
2). 
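
Because SSR alleles differ in the number of tandem repeats between fixed flanking sequences, the amplified product length differs by a multiple of the motif length. A minimal sketch, assuming two made-up alleles with invented flanking sequences and a hypothetical helper `ssr_tracts`:

```python
import re

def ssr_tracts(seq, motif, min_repeats=3):
    """Return (start, repeat_count) for each run of motif at least min_repeats long."""
    pattern = re.compile("(?:%s){%d,}" % (re.escape(motif), min_repeats))
    return [(m.start(), len(m.group()) // len(motif)) for m in pattern.finditer(seq)]

# Hypothetical flanking sequences; primers would be designed from these regions.
flank_left, flank_right = "GGCTAGC", "CCGATTG"
allele_1 = flank_left + "AT" * 8 + flank_right
allele_2 = flank_left + "AT" * 11 + flank_right

tracts_1 = ssr_tracts(allele_1, "AT")
tracts_2 = ssr_tracts(allele_2, "AT")
length_difference = len(allele_2) - len(allele_1)  # 3 extra repeats x 2 bp
```

A heterozygote carrying both alleles would yield two products of different length, which is why this marker class can be scored as codominant.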

CAPs. Cleaved amplified polymorphic se- 
quences (CAPs) are a form of genetic varia- 
tion in the length of DNA fragments generated 
by the restriction digestion of PCR products 
(Konieczny and Ausubel, 1993; Jarvis et al.,
1994). The source of the sequence informa- 
tion for the primers can come from a genebank, 
genomic or cDNA clones, or cloned RAPD
bands. This marker class is codominant in its
behavior. 

MARKER SYSTEM SELECTION 

Selection of a DNA marker system for 
plant breeding depends on project objectives, 
population structure, the genomic diversity of 
the species under investigation, marker sys- 
tem availability, time required for analysis, 
and the cost per unit information (Table 1). 
Clearly, each marker system has advantages 
and disadvantages, and therefore it is critical 
to evaluate each marker system for its poten- 
tial utility before use. For example, intraspe- 
cific maps can be constructed with a common 
set of RFLP probes; however, each species 
initially requires the construction of a map. 
The range of polymorphism in species also 
plays a role in marker selection (e.g., in self- 
pollinated species RAPDs are more useful for 
detecting polymorphisms within a gene pool 
than RFLPs). Moreover, the use of a marker 
system in one species does not necessarily 
indicate its efficacy in another species. 

Marker systems also differ in their utility 
across populations, species, and genera, and 
their efficacy in the detection of polymor- 
phism. For instance, RFLPs mapped in one 
population can be used as probes for charac- 
terizing other populations within the same 
species. In contrast, SSRs can be as informa- 
tive as RFLPs, but polymorphic primers iden- 
tified in one species are generally not useful in 
another species. Likewise, maps using RAPDs, 
SPARs, and AFLPs can be constructed in a
relatively short period; however, such markers
are not useable across populations, because
each marker is primarily defined by its
length (i.e., sequence information may be limited).
Moreover, the same size band amplified
across populations/species does not necessarily
mean that bands possess the same sequence,
unless proven by hybridization studies
(Thormann et al., 1994). In contrast to
RFLPs, these marker systems possess all the 
advantages of PCR-based systems (i.e., small 
sample requirement, high throughput, and early 
selection) (Table 1). These advantages can be 
nullified if polymorphism within a species is 
low. Where the level of polymorphism is low, 
STRs and SSRs are currently the marker sys- 
tems of choice. However, the cost and time 
required to develop these marker systems can 
be considerable (Table 1). 

Fig. 2. Schematic of inheritance patterns of DNA markers in F2 and BC1 progenies. Common bands shown
in the patterns of the parents and progeny may not be seen with other probes and primers. RFLPs, AMP-FLPs,
STRs, and SSRs are codominant markers and thus heterozygosity in F2 and BC1 is easily detected.
In contrast, RAPDs, SPARs, DAFs, AP-PCR, and AFLPs are dominant markers and detection of
heterozygosity is, with rare exception, not possible. Zygosity determination is also possible through the
quantification of DNA bands (e.g., AFLPs).

Table 1. Comparisons among several molecular marker systems for various technical attributes and proprietary rights status.

                                Molecular marker systems^z
Critical variables              RFLP   AFLP   SPAR   RAPD   DAF    SSR/AMP-FLP
Cloning and sequencing          Yes    No     No     No     No     Yes
Information content per run^x   0-3    0-30   0-20   0-4    0-10   0-2
Marker type^w                   C      D      D      D      D      C
Zygosity detection^v            Yes    Yes    No     No     No     Yes
Automation^u                    +      +++    ++     ++     ++     +++
Utility of genetic maps^t       SS     CS     CS     CS     CS     SS
Proprietary rights status^s     NA     LC     NA     LC     LC     NA

^z RFLP = restriction fragment length polymorphism, AFLP = amplified fragment length polymorphism, SPAR = single primer amplification reaction, RAPD =
random amplified polymorphic DNA, DAF = DNA amplification fingerprinting, and SSR/AMP-FLP = simple sequence repeats/amplified fragment length
polymorphism.
^y The sampling time after sowing is shown for corn (Zea mays) (relative time applies to all crop species).
^x Markers obtained per hybridization or PCR reaction. For SSR/AMP-FLP, number of markers per run does not reflect multiplexing.
^w D and C equal dominant and codominant markers, respectively.
^v Heterozygous alleles can be distinguished from the homozygous alleles.
^u On a scale of 1 to 3, + = the least and +++ = the most potential for automation. Automation refers to mechanizing
steps involving processing of DNA and detection, identification, and scoring of markers.
^t Refers to the relative utility of maps constructed with a given marker system either within that species (SS = species specific) or to a
specific cross or population (CS = cross specific).
^s NA = not applicable, LC = license required to practice the technology.

Costs per unit information (data point)
depend on the time required for DNA extraction
(sampling), the amount of DNA needed
for analysis, whether cloning and sequencing
is necessary, the amount of potentially useful
genetic information acquired, the type of genetic
information needed, whether the allelic
variation can be ascribed to banding patterns
(dominant vs. codominant), whether the electrophoretic
system can be automated, the potential
utility of genetic maps, and the proprietary
status of the technique (Table 1). Codominant
markers, such as RFLPs, are useful for
MAS and evolutionary studies, but their use
can be time consuming and relatively expensive,
and may require considerable technical expertise.
Often the high cost of developing an
RFLP marker system for a new species or 
inefficiency of MAS, due to the large quantity 
of DNA required and slow screening process, 
results in a decision to use a PCR-based sys- 
tem. Nevertheless, RFLPs command an ad- 
vantage in systematic and evolutionary stud- 
ies because, in contrast to most PCR-based 
technologies, which detect variation in 20-40 
bases (combined length of primers), data 
acquisition is based on the homology among
large fragments of DNA (length of probes). 
The use of RFLPs for such studies is enhanced 
if polymorphisms are abundant. 

MAP CONSTRUCTION 

The development of molecular marker tech- 
nology and consequent identification of many 
marker loci has caused renewed interest in 
genetic mapping. Genetic map construction 
requires that the researcher: 1) select the most 
appropriate mapping population(s); 2) calcu- 
late pairwise recombination frequencies using 
these population(s); 3) establish linkage groups 
and estimate map distances; and 4) determine 
map order. Since large mapping populations 
are often characterized by different marker 
systems, map construction has become com- 
puterized. Computer packages such as Link- 
age 1 (Suiter et al., 1983), GMendel (Echt et 
al., 1992), Mapmaker (Lander and Botstein, 
1986; Lander et al., 1987), MapManager 
(Manly and Elliot, 1991), and JoinMap (Stam,
1993) have been developed to aid in the analy- 
sis of genetic data for map construction. These 
programs use data obtained from segregating 
populations to estimate recombination fre- 
quencies that are then used to determine the 
linear arrangement of genetic markers by mini- 
mizing recombination events. 
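
The idea of estimating pairwise recombination and then choosing the marker order that minimizes recombination can be sketched as follows. This is a toy brute-force illustration, not the algorithm of any of the packages named above (which use more elaborate likelihood-based methods); the marker names, the backcross scores, and both helper functions are hypothetical.

```python
from itertools import permutations

def recombination_fraction(scores_1, scores_2):
    """Proportion of individuals whose scores differ between two markers."""
    recombinants = sum(a != b for a, b in zip(scores_1, scores_2))
    return recombinants / len(scores_1)

def best_order(markers):
    """Return the marker order with the smallest sum of adjacent recombination fractions."""
    def total(order):
        return sum(recombination_fraction(markers[x], markers[y])
                   for x, y in zip(order, order[1:]))
    # An order and its reverse are equivalent; keep one representative of each.
    candidates = [o for o in permutations(markers) if o[0] < o[-1]]
    return min(candidates, key=total)

# Hypothetical backcross scores for 10 progeny
# (A = homozygous for the recurrent-parent allele, H = heterozygous).
markers = {
    "m1": "AAAAAHHHHH",
    "m2": "AAAAHHHHHH",
    "m3": "AAHAHHHHHA",
}
r12 = recombination_fraction(markers["m1"], markers["m2"])
order = best_order(markers)
```

Exhaustive enumeration is only workable for a handful of markers, since the number of possible orders grows factorially with marker number.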

Mapping populations 

Selection of mapping populations is critical
to successful map construction. Since a
map's economic significance will depend upon
marker-trait associations, as many qualita- 
tively inherited morphological traits as pos- 
sible should be included in genetic stocks 
chosen as parents for map construction (Table 
2). Consideration must be given to the source 
of parents (adapted vs. exotic) used in the 
mapping population. Chromosome pairing and 
recombination rates can be severely disturbed 
(suppressed) in wide crosses (adapted × exotic)
and generally yield greatly reduced linkage
distances (Albini and Jones, 1987; Zamir
and Tadmor, 1986). Wide crosses will usually
provide segregating populations with a rela- 
tively large array of polymorphisms when 
compared to progeny segregating in a narrow 
cross (adapted x adapted). To have significant 
value in plant improvement programs, a map 
made from a wide cross must be colinear (i.e., 
order of loci similar) with maps constructed 
using adapted parents. 

The choice of an appropriate mapping popu- 
lation depends on the type of marker systems 
employed (Tanksley et al., 1988). Maximum 
genetic information is obtained from a completely
classified F2 population using a codominant
marker system (Mather, 1938). Information
from a dominant marker system can be
equivalent to a completely classified F2 population
if progeny tests (i.e., F3 or F2BC) are
used to identify heterozygous F2 individuals.
This procedure is often prohibitive because of 
the cost and time involved in progeny testing. 

Dominant markers supply as much infor- 
mation as codominant markers in recombinant 
inbred lines (RI) (i.e., an array of genetically 
related lines; usually >F8), doubled haploids,
or backcross populations in coupling phase
(Burr et al., 1988). Information obtained from
dominant markers can be maximized by using 
RI or doubled haploids because all loci are 
homozygous, or nearly so. Under conditions 
of tight linkage (i.e., about <10% recombination),
dominant and codominant markers evaluated
in RI populations provide more information
per individual than either marker type in
backcross populations (Reiter et al., 1992).
However, as the distance between markers 
becomes larger (i.e., loci become more inde- 
pendent), the information obtained per unit 
individual in RI populations decreases dra- 
matically when compared to codominant mark- 
ers. 

Backcross populations can be useful for 
mapping dominant markers if all loci in the 
recurrent parent are homozygous and the do- 
nor and recurrent parent have contrasting poly- 
morphic marker alleles (Reiter et al., 1992). 
Information obtained from backcross popula- 
tions using either codominant or dominant 
markers is less than that obtained from F2
populations because one, rather than two, re- 
combinant gametes are sampled per plant. 
Backcross populations, however, are more 
informative (at low marker saturation) when 
compared to RIs as the distance between linked 
loci increases in RI populations (i.e., about 
>15% recombination). Increased recombina- 
tion can be beneficial for resolution of tight 
linkages, but may be undesirable in the con- 
struction of maps with low marker saturation. 

Progeny testing of F2 individuals is often
used in map construction where phenotypes
do not consistently reflect genotype (e.g., disease
reaction and many useful traits) or where
trait expression is controlled by QTLs. Segregation
data from progeny test populations (e.g.,
F3 or F2BC) can be used in map construction.
MAS can then be applied to cross progeny
based on marker-trait map associations, especially
in early generations (F2, F3), where linkage
groups have not been completely disassociated
by recombination events (i.e., maximum
disequilibrium).

Recently, a method has been developed for 
the rapid identification of linkage using bulked 
segregant analysis (BSA; Michelmore et al., 
1991). In BSA, two bulked DNA samples are
drawn from a segregating population originat- 
ing from a single cross. These bulks contain 
individuals that are identical for a particular 
trait (e.g., resistant or susceptible to a particular
disease) or genomic region but arbitrary at
all unlinked regions (i.e., heterozygous). The 
bulks are screened for DNA polymorphisms 
and these differences compared against a ran- 
domized genetic background of unlinked loci. 
Thus, differences between the two bulks indi- 
cate markers (e.g., bands) that are linked to a 
particular trait. BSA overcomes several prob- 
lems that are associated with the use of nearly 
isogenic lines (NILs), which require many 
backcrosses to develop. Where only a portion 
of the polymorphic loci are expected to map to 
a selected region using NILs (e.g., BC5 only
50%), regions unlinked to the target region 
will not differ between the bulked samples of 
many individuals in BSA. Moreover, all loci 
detected during BSA will segregate and can be 
mapped, thus eliminating the linkage drag 
problems (i.e., genes incorporated into lines
by backcrossing that are flanked by DNA segments
introduced from the donor parent) associated
with NILs (Young and Tanksley, 1989).

Calculation of recombination fraction 

Crossover events can be described as the 
percentage of recombination in offspring. Only 
half the meiotic products will be crossover
types (recombinants) when one chiasma forms
between two loci. Multiple crossovers can 
also be detected through observation of prog- 
eny phenotypes. Single crossover events are 
not independent and the number of double 
crossover events is usually smaller than pre- 
dicted. This positive "interference" varies, 
depending on organism, crossover location, 
environmental factors, and numerous other 
factors. Therefore, accurate estimates of double 
crossing over can only be obtained when inter- 
ference is considered. Interference is mea- 
sured as a coefficient of coincidence (CC), 
which is an expression of the ratio of observed 
double crossovers to those predicted (expected) 
by a map. The expected double crossover 
frequency is predictable if two crossovers are 
independent events or if interference can be 
measured. 
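
The coefficient of coincidence described above can be computed directly from three-point cross data. The numbers below are hypothetical, and the relation interference = 1 - CC is the standard textbook definition rather than something stated in the text.

```python
def coefficient_of_coincidence(observed_double, r_ab, r_bc, n_progeny):
    """Observed double crossovers divided by the number expected under independence."""
    expected_double = r_ab * r_bc * n_progeny
    return observed_double / expected_double

# Hypothetical three-point testcross: 1000 progeny, 10% and 20% recombination
# in the two adjacent intervals, 12 double-crossover progeny observed.
cc = coefficient_of_coincidence(12, 0.10, 0.20, 1000)
interference = 1 - cc  # positive when fewer doubles are seen than expected
```

With these figures 20 double crossovers would be expected if the two intervals recombined independently, so observing only 12 indicates positive interference.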

The proportion of mean number of recom- 
bination events defines the map distance be- 
tween two loci. The relationship between map 
distance and recombination value is charac- 
terized by a genetic mapping function (mf). An 
mf is a formula expressing quantitative rela- 
tionships between distances in a linkage map 
using crossover frequency. There are several 
types of mapping functions that can be applied,
depending on the assumed degree of
crossover interference that best represents the 
mapping population. The most common map- 
ping functions were developed by Haldane 
(1919) and Kosambi (1944). While Haldane's
mf assumes absence of interference, Kosambi's
assumes positive interference (i.e., fewer 
double recombinants when compared to no 
interference). 
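
For readers who want to see the two functions side by side, a minimal sketch follows. The formulas are the standard published forms of the Haldane and Kosambi functions, expressed in centimorgans; the function names are simply chosen for this example.

```python
import math

def haldane_cM(r):
    """Haldane (1919): map distance in cM, assuming no interference."""
    return -50.0 * math.log(1.0 - 2.0 * r)

def kosambi_cM(r):
    """Kosambi (1944): map distance in cM, assuming positive interference."""
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))

for r in (0.10, 0.30, 0.35):
    print(f"r = {r:.2f}: Haldane {haldane_cM(r):.1f} cM, Kosambi {kosambi_cM(r):.1f} cM")
```

For r = 0.30 and r = 0.35 the Haldane function gives about 45.8 and 60.2 cM, respectively, while the Kosambi function gives smaller distances for the same r because it assumes fewer double crossovers. Both functions diverge as r approaches 0.5.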

The frequency of recombinant gametes 
produced can be used as an index of the dis- 
tance between two loci on a chromosome [1 
map unit = about 1 cM]. Map distance, how- 
ever, is not completely additive. Additivity is 
based on the assumption that the average num- 
ber of crossovers per chromatid occurring 
between two loci is directly proportional to the 
distance between the two loci. The frequency 
of recombination (percentage) and map dis- 
tance is, however, not directly proportional. 
Estimates of the frequency of crossing over 
will be most reliable when genes are relatively 
closely linked (1 to 10 map units). Recombination
percentage is only equivalent to map
distance within the range of the minimum 
distance for crossing over (lack of additivity) 
because double crossovers occur at significant 
frequency. A nonlinear relationship occurs
when estimates are made outside of this distance.
Therefore, the actual map distance between
two genes will tend to be underestimated
by the recombination fraction (r) (e.g.,
r = 0.10 corresponds to 10 cM, r = 0.30 to 45.8 cM,
and r = 0.35 to 60.2 cM), such that at large
distances (~40-50 map units) the two genes will be strictly
independent of each other (Kosambi, 1944).

Table 2. Total map length and mean distance between genetic markers in various plant species.

Common name Scientific name                        Pop.^z  Marker type^y            No.   Map length (cM)  Mean distance (cM)  Reference
Arabidopsis A. thaliana L.                         RI      RFLP                     320   630              2.0                 Reiter et al., 1992
Banana      Musa acuminata Colla                   F2      RFLP, isozyme, RAPD      90    606              10.0                Faure et al., 1993
Barley      Hordeum vulgare L.                     DH      RFLP, isozyme, RAPD,     295   1250             4.2                 Kleinhofs et al., 1993
                                                           morphological, SAP,
                                                           disease
Bean        Phaseolus vulgaris L.                  BC      RFLP, isozyme, seed      244   1200             ≈5.0                Vallejos et al., 1992
                                                           protein, flower color
                                                   F2      RFLP, RAPD, isozyme      152   827              6.5                 Nodari et al., 1993a
Brassica    B. napus L.                            F2      RFLP, isozyme, RAPD,     120   1413             14.0                Landry et al., 1991
                                                           morphology, disease
            B. napus L.                            F1-DH   RFLP                     132   1016             7.7                 Ferreira et al., 1994
            B. rapa L. (syn. campestris)           F2      RFLP                     280   1850             6.9                 Song et al., 1991
            B. rapa                                F3      RFLP, seed color,        139   1785             13.5                Teutonico & Osborn, 1994
                                                           seed erucic acid,
                                                           pubescence
Citrus      C. grandis L. × C. paradisi Macf.      F1      RFLP                     46    1700             20.0                Jarrell et al., 1992
Cotton      Gossypium hirsutum L. ×                F2      RFLP                     705   4675             7.1                 Reinisch et al., 1994
            G. barbadense L.
Cucumber    Cucumis sativus L.                     F2      RFLP, RAPD, isozyme,     58    766              8.0                 Kennard et al., 1994
                                                           morphology, disease
            C. sativus × C. hardwickii (R.) Alef.  F2      RFLP, morphology         70    480              8.1
Cuphea      Cuphea lanceolata Ait.                 F2      RFLP                     37    288              7.8                 Webb et al., 1992
Maize       Zea mays L.                            RI      RFLP                     334   1460             ≈5.0                Burr et al., 1988
Potato      Solanum tuberosum L. ×                 BC      RFLP, isozyme            977   684              0.7                 Tanksley et al., 1992
            S. berthaultii Con.
Rice        Oryza sativa L. ×                      BC      RFLP                     726   1491             2.0                 Causse et al., 1994
            O. longistaminata A. Chev. & Roehr.
Rye         Secale cereale L.                      IBL     RFLP, isozyme, RAPD,     60    350              6.0                 Philipp et al., 1994
                                                           morphology, physiology
Sorghum     S. bicolor L.                          F2      RFLP                     98    949              10.0                Whitkus et al., 1992
                                                   F2      RFLP                     190   1789             9.4                 Xu et al., 1994
Soybean     Glycine max L.                         F2      RFLP                     252   2147             8.5                 Diers et al., 1992
Sugar beet  Beta vulgaris L.                       F2      RFLP                     115   789              6.9                 Pillen et al., 1992
Tomato      Lycopersicon esculentum Miller ×       F2      RFLP, isozyme            1030  1276             1.2                 Tanksley et al., 1992
            L. pennellii

^z RI = recombinant inbred, DH = doubled haploid, and IBL = inbred line.
^y RFLP = restriction fragment length polymorphism, RAPD = random amplified polymorphic DNA, and SAP = specific amplicon polymorphism.

Linkage phase 

Genes are linked when they are on the same 
chromosome. There are two possible arrange- 
ments of two genes on a pair of chromo- 
somes — coupling and repulsion. Coupling sig- 
nifies that the two recessive alleles are carried 
in one chromosome and the two dominant 
alleles in the other (i.e., AB/ab); repulsion 
describes the alternate arrangement (i.e., Ab/ 
aB). This relationship is particularly impor- 
tant when dealing with dominant markers dur- 
ing map construction. Two linked markers 
scorable as dominant alleles [e.g., AA or band 
presence (+)] can only be recognized in cou- 
pling phase linkage. This is because the het- 
erozygote class cannot be distinguished from 
the homozygote dominant class (i.e., band 
presence = AA or Aa; band absence = aa). In
contrast, codominant markers allow for the 
expression of both pairs of alleles (i.e., pheno- 
typically as AA, Aa, aa). Linkage phase has 
proven important in selection when dominant 
markers are used (Haley et al., 1994). A greater
proportion of bean (Phaseolus vulgaris L.) 
genotypes homozygous resistant to bean com- 
mon mosaic virus (BCMV), and a lower pro- 
portion of segregating and homozygous sus- 
ceptible genotypes were recovered when se- 
lection was imposed against a repulsion-phase 
RAPD marker than when selection was made 
for the coupling-phase RAPD marker. This 
observation is of practical significance where 
resistance is conditioned by recessive genes 
since it requires breeders to select against the 
heterozygous susceptible individuals (Kelly, 
1995). In the case of selection for the recessive 
BCMV resistance gene (bc-3) in bean, Kelly 
stated: "selection of individuals based on the 
phenotype of combined coupling and repul- 
sion-phase RAPD markers was equivalent to 
selection based on a codominant marker 
(RFLP) and was identical to selection based 
on the repulsion-phase marker alone." 

Establishment of linkage groups 

If linkage is indicated by Chi-square analy- 
sis of progeny segregation, then the potential 
for linkage between loci can be mathemati- 
cally tested. There are several mathematical 
methods available for investigating potential 
linkage relationships (Crow, 1990). Among 
these are the maximum likelihood and least 
squares/regression methods. Currently, these 
are the methods of choice for linkage estima- 
tion because they result in estimates that have 
the smallest standard error (Mather, 1938; 
Nordheim et al., 1984). They are especially 
useful where multiple loci (QTL) are involved 
(Shute, 1988). While least squares estimation 
attempts to minimize deviations from a math- 



ematical model (regression), maximum likeli- 
hood involves comparisons among two or 
more plausible hypotheses (e.g., linkage vs. no 
linkage). The maximum likelihood method is 
particularly useful in evaluating genetic phe- 
nomena and will be used in this discussion as 
an illustration of linkage analysis (Chakravarti 
et al., 1991).

Maximum likelihood and likelihood odds 
ratio (LOD) value 

Mather (1938) developed the maximum 
likelihood approach for linkage analysis. It is 
used by various computer-based linkage pro-
grams (e.g., Mapmaker; Lander et al., 1987) to
determine the probability of linkage between a 
given marker and a known marker. Maximum
likelihood is a statistical procedure designed
to choose values for variables that maximize a
defined function, which is done by differentiating
the function and solving for the point where the
derivative equals 0, or by iteration.

Linkage estimation using the method of 
maximum likelihood is based on the binomial 
expansion, which is a special case of the poly-
nomial $(m_1 + m_2 + m_3 + \cdots)^n$. Maximum
likelihood, as applied to linkage estimation, 
attempts to select a linkage estimator (r value) 
that minimizes an expectation function in a 
binomial expression. The benefit of maximum 
likelihood in the calculation and estimation of 
r is that functions can be designed that include 
ambiguous classes. An example of an ambigu- 
ous class is the double heterozygote (AaBb) of 
an F2 family that contains recombinant and
non-recombinant types. 

Recombination value is used in the maxi- 
mization expression to determine the likeli- 
hood (L) of association of linkage between a 
set of variables (i.e., genetic loci). The value of 
r that maximizes the likelihood of the ob- 
served outcome is determined. Solutions are 
limited to the range of 0 to 0.5. 
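
As a minimal sketch of the maximization step, the example below assumes the simplest case, a backcross in which every progeny can be scored unambiguously as recombinant or non-recombinant, so that the likelihood is binomial. It searches a grid of r values rather than solving analytically; the function names are hypothetical, and ambiguous classes such as the F2 double heterozygote are not handled.

```python
import math


def log_likelihood(r, recombinants, total):
    """Binomial log-likelihood of r given the recombinant count."""
    return (recombinants * math.log(r)
            + (total - recombinants) * math.log(1 - r))


def ml_recombination(recombinants, total, steps=5000):
    """Grid search for the r in (0, 0.5] that maximizes the likelihood."""
    grid = [0.5 * i / steps for i in range(1, steps + 1)]
    return max(grid, key=lambda r: log_likelihood(r, recombinants, total))


print(ml_recombination(20, 100))  # close to 20/100
print(ml_recombination(60, 100))  # capped at the upper limit of 0.5
```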

After maximization, the question is raised
as to whether the value of r, say x, is signifi-
cant, given the upper limit of no linkage (i.e.,
0.5); that is, how the probability that two
loci are linked with a given r value compares with
the probability that the two loci are not linked. An
understanding of the precision of r is neces- 
sary to assess the utility of the value obtained 
by likelihood maximization. Historically, this 
has been done in two ways. Allard (1956) 
constructed a series of tables and formulae to 
calculate recombination values and associated 
standard errors using a maximum likelihood 
approach. This approach had already been 
widely used by plant geneticists (Fisher, 1946;
Kramer and Burnham, 1947). Researchers in 
human genetics gravitated toward the use of 
the LOD defined by Haldane and Smith (1947).
This approach has been used by some plant
researchers because the LOD calculations
needed for analyzing large populations and
using many markers have been simplified by
the use of some computer-based linkage esti-
mation programs (e.g., Mapmaker; Lander
and Botstein, 1986; Lander et al., 1987).

The odds ratio of a maximization event is
given as: L(x)/L(0.5). This form, however, is
inconvenient in most instances and the log of
the odds ratio (i.e., LOD = $\log_{10}[L(x)/L(0.5)]$)
is used (Risch, 1992). In many analyses, a
significance level of LOD > 3.0 is appropriate 
as an acceptance level of linkage between two 
loci. This value is equivalent to saying that the 
alternative hypothesis (linkage) has to be 
greater than 1000 times more probable than 
the null (no linkage) hypothesis. If this analy-
sis is repeated over 100 marker loci, a signifi-
cance level of LOD > 3 for each locus is compa-
rable to an experiment-wise (genome-wise)
type I error rate of alpha (α) = 0.01. LOD
decreases with increasing r values and in-
creases with increasing sample size.
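
The LOD calculation can be sketched for the same simple backcross case, in which the likelihood is binomial in the number of recombinants. This is an illustration under that assumption, not the algorithm used by Mapmaker; the function name is hypothetical.

```python
import math


def lod_score(recombinants, total):
    """LOD = log10[L(r_hat) / L(0.5)] for a backcross with fully
    classifiable progeny, where r_hat = recombinants / total (max 0.5)."""
    r_hat = min(recombinants / total, 0.5)

    def log10_likelihood(r):
        value = (total - recombinants) * math.log10(1 - r)
        if recombinants:  # skip the r-term when no recombinants were seen
            value += recombinants * math.log10(r)
        return value

    return log10_likelihood(r_hat) - log10_likelihood(0.5)


print(round(lod_score(10, 100), 2))  # same r_hat = 0.1, larger sample
print(round(lod_score(5, 50), 2))    # same r_hat = 0.1, smaller sample
print(round(lod_score(20, 100), 2))  # same sample size, larger r_hat
```

With these inputs the LOD falls as r increases and rises with sample size, in line with the statement above. Ten non-recombinant progeny with no recombinants give a LOD just above 3.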

Tests of linkage for qualitatively inherited 
traits vary in scope and operation. The re- 
searcher must determine a threshold LOD 
value below which linkage is not considered 
significant (Churchill and Doerge, 1994). As 
the LOD threshold is raised, fewer markers are 
assigned to linkage groups (i.e., independent
loci), and more and smaller linkage groups are 
identified. Comparison of maps created from 
an array of LOD values often allows the re- 
searcher to determine the stability of putative 
linkage groupings. It is clear that any map only 
approximates reality and that map distances 
between markers will change as new informa- 
tion (i.e., more markers) becomes available. 
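
The effect of the LOD threshold on grouping can be shown with a small sketch. It assumes that markers are joined into one group whenever any pair between them reaches the threshold (single-linkage grouping); the marker names and pairwise LOD values are invented for illustration only.

```python
def linkage_groups(markers, pairwise_lod, threshold):
    """Group markers so that any pair with LOD >= threshold shares a group."""
    parent = {m: m for m in markers}

    def find(m):
        # Follow parent links to the group representative.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for (a, b), lod in pairwise_lod.items():
        if lod >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for m in markers:
        groups.setdefault(find(m), []).append(m)
    return sorted(groups.values())


markers = ["A", "B", "C", "D"]
pairwise_lod = {("A", "B"): 8.0, ("B", "C"): 3.5, ("C", "D"): 1.0}  # hypothetical
print(linkage_groups(markers, pairwise_lod, threshold=3.0))
print(linkage_groups(markers, pairwise_lod, threshold=5.0))
```

Raising the threshold from 3.0 to 5.0 splits the three-marker group, leaving more and smaller groups.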

Gene order determination and map 
merging 

Because additivity of map distance is ac- 
cepted (assuming no double crossovers occur) 
for narrow intervals (1-10 map units), tightly 
linked genes can be placed in relative order. 
Genes that are loosely linked (>20 map units) 
can be placed on a map but their location is 
much more tentative. The map distances cal- 
culated based on crossover percentages (i.e., 
genetic map) often bear no direct relationship 
with the actual physical distances between 
linked genes (i.e., physical map) (Stansfield, 
1969; Swanson et al., 1990). The linear order 
in the physical and genetic maps, however, 
should theoretically be identical. 

Three linked genes may be in any one of 
three orders, depending on which gene is in the 
middle of the linkage group. Traditionally, 
gene orders have been determined from either 
two- or three-point testcross data. When mul- 
tiple crossovers occur with much greater than 
random frequency (i.e., localized negative in- 
terference), gene order of closely linked sites 
can be ascertained using three-factor recipro- 
cal crosses. 
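
For three-point testcross data, the middle gene can be inferred by comparing the most frequent (parental) class with the rarest (double-crossover) class: the locus whose allele is switched relative to the other two lies in the middle. The sketch below assumes ordinary interference, so that double crossovers are the rarest class; the class counts are hypothetical.

```python
def middle_locus(class_counts):
    """Infer the middle locus from three-point testcross gamete counts.

    class_counts maps gamete classes such as "ABC" or "aBc" to counts;
    upper and lower case denote the two alleles at each locus.
    """
    parental = max(class_counts, key=class_counts.get)
    double_crossover = min(class_counts, key=class_counts.get)
    same = [p == d for p, d in zip(parental, double_crossover)]
    if same.count(False) == 1:
        index = same.index(False)   # one locus switched
    elif same.count(True) == 1:
        index = same.index(True)    # compared against the other parental type
    else:
        raise ValueError("rarest class is not a double crossover")
    return parental[index].upper()


counts = {"ABC": 400, "abc": 410, "Abc": 50, "aBC": 45,
          "ABc": 40, "abC": 42, "AbC": 4, "aBc": 6}  # hypothetical data
print(middle_locus(counts))
```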

Genetic maps in several crop species have 
been constructed using various marker sys- 
tems, types of populations, and, often, genera- 
tions. Although selected data for several crops 
are presented for comparison (Table 2), de- 
tailed and updated information on these and 
other species resides in genome databases 
housed in the U.S. National Agricultural Li- 
brary, Beltsville, Md. Maps in many species 
are moderately saturated and incorporate 
isozymes, RFLPs, and RAPDs. There are at 
least three maps for potato (Solanum tuberosum
L.) and rice (Oryza sativa L.), and two for bean
and pepper (Capsicum annuum L.) that have
been constructed using various parents ana-
lyzed in diverse generations. Likewise, mul-
tiple maps are being developed for other plant
species [e.g., Arabidopsis thaliana (L.) Heynh.,
corn (Zea mays L.), and tomatoes
(Lycopersicon esculentum Mill.)].

If two or more genetic maps possess a 
minimal number of common markers they can 
be merged to create a more informative map 
(Hauge et al., 1993). However, the type of 
information (e.g., F2 vs. BC) and precision of
estimates of recombination frequencies (fam- 
ily size) often vary greatly between popula- 
tions and data sets. Therefore, any procedure 
that attempts to merge mapping information 
must "weigh" these types of information to 
create the "optimal," "most likely" map with 
the least amount of "internal tension." 

A computer program, JoinMap, has re- 
cently been developed that considers the esti- 
mates of recombination frequency between a 
given pair of markers of different origin (data 
sets/mapping populations), calculates and ap- 
plies the appropriate weighting, and then gen- 
erates a single recombination value (Stam, 
1993). After assigning weights to all available 
pairwise combinations, JoinMap institutes a 
numerical search for the best-fitting linear 
arrangement of the marker loci. JoinMap cal- 
culates a goodness-of-fit criterion correspond- 
ing to the two hypothesized levels of interfer- 
ence (positive and negative), allowing for an 
examination of each synthesized map. 
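
The idea of weighting pairwise estimates from different populations can be sketched as follows. This is not JoinMap's actual weighting scheme, which is not detailed here; the example simply assumes that each estimate comes from a backcross of n individuals, so that its variance is approximately r(1 - r)/n, and pools the estimates by inverse-variance weighting. The function name and inputs are hypothetical.

```python
def pooled_recombination(estimates):
    """Pool recombination estimates for one marker pair.

    estimates -- list of (r, n) pairs: the recombination fraction and the
                 family size of each data set, with 0 < r < 1.
    Each estimate is weighted by n / (r * (1 - r)), the inverse of its
    approximate binomial variance (an assumption of this sketch).
    """
    weights = [n / (r * (1 - r)) for r, n in estimates]
    weighted_sum = sum(w * r for w, (r, _) in zip(weights, estimates))
    return weighted_sum / sum(weights)


# A large family (n = 400) and a small family (n = 100) disagree on r.
print(round(pooled_recombination([(0.10, 400), (0.20, 100)]), 3))
```

The pooled value lies closer to the estimate from the larger family, which is the behavior a merging procedure needs when family sizes differ greatly.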

Identification of QTLs 

In contrast to classical linkage detection 
for single gene traits, different strategies have 
been suggested for the identification (i.e., de- 
tection and localization) of single QTLs 
(Edwards et al., 1987; Jiang and Zeng, 1995; 
Lander and Botstein, 1989). Such strategies 
attempt to identify major levels of the total 
genetic variance that contribute to a trait's 
variation. They differ in approach in the num- 
ber of markers that they evaluate during link- 
age estimation. Tests for QTL/trait association 
can involve the evaluation of one marker at a 
time, two marker loci simultaneously, or the 
consideration of all possible marker loci at 
once. Typical of a one-marker comparison 
strategy is the use of the one-way analysis of 
variance (F test) for the analysis of BC prog- 
eny, and marker genotype means comparisons 
(t test) for BC and F2 populations (Soller et al.,
1976; Stuber et al., 1992). This approach ig- 
nores the potential recombination between a 
marker and a QTL, and thus will lead to an 
underestimation of QTL effects if the marker 
and QTL are not coincident (Edwards et al., 
1987). A single marker approach may also 
incorporate a trait-based analysis in which 
individuals in the tails of a population distribu- 
tion are sampled for marker frequencies 
(Lebowitz et al., 1987). In this case, those
markers that differ in frequency between the
tails of the distribution are assumed to
be associated with the QTL affecting the trait.
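
The underestimation that arises from ignoring marker-QTL recombination can be made concrete for a backcross. If the two QTL genotypes differ by an amount a and the marker lies at recombination fraction r from the QTL, the expected difference between marker-class means is (1 - 2r)a. The sketch below shows this together with a pooled-variance two-sample t statistic; the trait values are invented and the function names are hypothetical.

```python
import math
import statistics


def expected_marker_contrast(qtl_effect, r):
    """Expected difference between marker-class means in a backcross."""
    return (1 - 2 * r) * qtl_effect


def two_sample_t(group_a, group_b):
    """Pooled-variance t statistic comparing two marker-genotype classes."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * statistics.variance(group_a)
                  + (nb - 1) * statistics.variance(group_b)) / (na + nb - 2)
    diff = statistics.mean(group_a) - statistics.mean(group_b)
    return diff / math.sqrt(pooled_var * (1 / na + 1 / nb))


print(expected_marker_contrast(10.0, 0.0))   # marker coincident with the QTL
print(expected_marker_contrast(10.0, 0.2))   # marker 20% recombination away
print(round(two_sample_t([12, 14, 13, 15], [10, 11, 9, 10]), 2))
```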

Approaches which examine two marker 
loci at once incorporate interval mapping strat- 



egies using maximum likelihood for the analy- 
sis of single QTLs flanked by a pair of marker 
loci (Lander and Botstein, 1989; Paterson et 
al., 1991). The interval approach was devel- 
oped to take advantage of additional informa- 
tion provided by linkage maps having a rela- 
tively high degree of genome saturation (i.e., 
spacing of markers every 5-20 cM) such as those of
tomato and maize (Paterson et al., 1991; Doebley
and Stec, 1991). The interval approach allows 
for the estimation of putative QTL effects at 
any location within a marker interval based on 
the means and variances observed in the marker 
classes and the recombination frequency be- 
tween the markers bracketing a particular in- 
terval (Lander and Botstein, 1989). This ap- 
proach is partially limited by its inability to 
test unlinked markers, and to accurately locate 
QTLs beyond the terminal markers of a given 
linkage group. 

The consideration of all possible marker 
loci at once during QTL analysis is complex 
and involves the regression of trait expression 
on multiple marker locus values (Cowen, 1989;
Rodolphe and Lefort, 1993; Stam, 1993). More
recently, interval mapping and multiple re- 
gression have been integrated ("hybrid" ap- 
proaches) to more accurately describe QTL/ 
trait associations (Haley and Knott, 1992; 
Jansen, 1992, 1993; Jansen and Stam, 1994; 
Knapp, 1991; Knapp et al., 1990; Martinez 
and Curnow, 1992; Moreno-Gonzales, 1992;
Zeng, 1993, 1994). Regardless of the mapping 
approach used, the success of MAS depends 
on the ability to detect QTLs and the consis-
tency of QTLs over environments and genera-
tions (Dijkhuizen, 1994; Lande and Thomp-
son, 1990; Shoemaker et al., 1994).

APPLICATION OF MARKERS 

Genetic markers have been used effec- 
tively in genetic diversity analysis and germ- 
plasm organization [e.g., Arachis (Lanham et 
al., 1992), Brassica (dos Santos et al., 1994; 
Thormann et al., 1994), Vaccinium (Novy et 
al., 1994)]; in genetic similarity estimation as 
predictors of hybrid performance (Bernardo, 
1994; Melchinger et al., 1990; Smith et al., 
1990); in genetic map construction for the 
localization of loci conditioning simply inher- 
ited traits [e.g., Pto locus for resistance to 
Pseudomonas syringae pathovar tomato (Pst) 
(Carland and Staskawicz, 1993); H1 gene for
resistance to Globodera rostochiensis (Woll.) 
Behrens in potato (Gebhardt et al., 1993); er- 
1 for resistance to powdery mildew in peas 
(Pisum sativum L.) (Timmerman et al., 1994);
downy mildew resistance in lettuce (Lactuca
sativa L.) (Paran and Michelmore, 1993); pho-
toperiod-sensitivity gene in rice (Mackill et
al., 1993)]; and QTL analysis (Edwards et al., 
1987; Table 3). Marker systems also provide 
the potential for map-based cloning of specific 
genes (Tanksley et al., 1995). Although theo- 
retical appraisals of MAS have shown that it 
could be useful in plant improvement, the 
application of MAS has not been rigorously 
evaluated in many crop species (Lande and 
Thompson, 1990). To date, no cultivar devel- 
oped through MAS has been publicly released. 



Theoretical considerations and computer 
simulation 

Theoretical investigations that probe the 
potential of MAS are of academic and practi- 
cal importance. Although there are three gen- 
eral kinds of selection (stabilizing, directional, 
and disruptive) that could be used by plant 
breeders, directional selection is preferred be- 
cause selected phenotypes are distinct from 
the initial population for economically impor- 
tant attributes. Truncation selection is the sim- 
plest type of directional selection. During trun- 
cation selection, a phenotypic value is identi- 
fied as the lower selection limit (truncation 
point) and individuals are recovered whose 
phenotypic values are equal to or beyond this 
value. A prediction equation for response to
truncation selection can be defined in terms of
response to selection (R; difference in mean
phenotype between the progeny generation
and the previous generation), heritability ($h^2$),
and the selection differential (S; difference in
mean phenotype between the selected parents
and the initial population mean) as: $R = h^2 S$.
Thus, realized heritability can be estimated in
the first generation of purely phenotypic se-
lection as: $h^2 = R/S$. Selection intensity (i) is
the selection differential expressed in units of
phenotypic standard deviations (s), such that
i = S/s.
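
The relationships among R, S, heritability, and selection intensity can be worked through numerically. The values below are hypothetical and chosen only to make the arithmetic easy to follow; the function names are not from the source.

```python
def response_to_selection(h2, selection_differential):
    """R = h^2 * S."""
    return h2 * selection_differential


def realized_heritability(response, selection_differential):
    """h^2 = R / S."""
    return response / selection_differential


def selection_intensity(selection_differential, phenotypic_sd):
    """i = S / s."""
    return selection_differential / phenotypic_sd


population_mean = 50.0   # hypothetical trait mean of the initial population
selected_mean = 60.0     # hypothetical mean of the plants kept as parents
S = selected_mean - population_mean

R = response_to_selection(0.3, S)
print(R)                              # expected gain in the progeny mean
print(population_mean + R)            # expected progeny mean
print(realized_heritability(R, S))    # recovers the heritability used above
print(selection_intensity(S, 8.0))    # S in phenotypic standard deviations
```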

Computer-based simulation can allow for 
tentative interpretation of relatively complex 
genetic comparisons that have not been previ-
ously possible (Edwards and Page, 1994; Lande
and Thompson, 1990). Using simple relation-
ships (e.g., $R = h^2 S$) and theoretical assump-
tions of variance components in an initial
population, Lande and Thompson (1990) pro-
posed a computer-based simulation model for
MAS to estimate genetic effects and gain from
index selection. The model provides theoreti-
cal estimates for response from truncation
selection for QTLs in an F2 population using
100 marker loci. The model derives selection
indices that maximize the rate of improvement 
in quantitative characters under various meth- 
ods of MAS. The model takes into account 
epistasis by combining multiplicative 
(multivariant) and classical additive approxi- 
mations of gene action. Selection is based on 
an index that incorporates phenotypic and 
molecular information. The model uses the 
linkage disequilibrium between molecular 
marker loci and quantitative trait loci (QTLs)
in populations created by a cross between two 
inbred lines. 

Various strategies for plant improvement 
were tested by Lande and Thompson (1990) 
using computer simulations to characterize 
MAS and to provide expectations for pheno-
typic selection. Potential increases in breeding
efficiency through MAS and the population
size needed to attain such increases depend
on the genetic parameters (i.e., heritability, the
proportion of the additive genetic variance
explained by the marker loci) and the selection
method used. Gain from selection (ΔG) of
quantitative traits based on estimated additive
effects could be greater for MAS than for
phenotypic selection. The relative worth of
MAS is greatest for characters with low heri-
tability when additive genetic variance is asso-
ciated with the marker. More recently,
Gimelfarb and Lande (1994b) have demon-
strated that this same logic could be applied to
nonadditive characters.

Table 3. Estimated number of quantitative trait loci (QTLs) affecting the expression of traits in several crops.

Population | Trait | QTL (no.) | Range (%) of explained phenotypic σ²^z | LOD (range) | Total phenotypic σ² explained (%) | Reference

Common bean (Phaseolus vulgaris L.)
F2:F3 | Nodule number | 4 | 1.5-2.8 | 11.0-17.0 | 50.0 | Nodari et al., 1993b
 | Resistance to common blight | 4 | 2.1-6.0 | 3.5-9.2 | 75.0 |

Corn (Zea mays L.)
F3 BC1 | Grain yield | 8 | 5.6-14.4 | [illegible] | | Stuber et al., 1992
F3 BC2 | Grain yield | 6 | 6.2-18.0 | [illegible] | |
F3:F[illegible] | European corn borer resistance | 7 | 2.3-9.1 | [illegible] | [illegible] | Schon et al., 1993
 | Plant height | 3 | 5.7-12.9 | [illegible] | [illegible] |
 | Ear height | 5 | 6.3-27.8 | [illegible] | [illegible] | Veldboom et al., 1994
 | Plant height | 5 | 6.4-39.5 | [illegible] | [illegible] |
 | GDD to anthesis^x | 6 | 2.1-5.9 | 6.2-33.6 | 62.9 | --
 | GDD to silk delay | 2 | 2.8-3.2 | 15.8-17.5 | [illegible] |
 | GDD to silk emergence | 5 | 2.4-11.3 | 7.8-53.1 | 80.9 |
F2^w | Number of cupules in single rank | 6 | 4.1-24.6 | 2.6-11.0 | | Doebley and Stec, 1991
 | Tendency of ear to shatter | 6 | 4.3-41.7 | 2.4-18.6 | |
 | Hardness of outer glume | 2 | 17.5-62.4 | 7.6-40.6 | -- | --
 | Average length of internodes^v | 5 | 4.7-45.3 | 3.0-11.7 | |
 | Number of branches^u | 4 | 4.3-24.3 | 2.8-7.6 | |
 | Percent cupules lacking spikelet^t | 5 | 8.0-25.1 | 2.9-9.9 | |
 | Number of ears on lateral branch | 7 | 6.3-24.5 | 2.8-12.9 | |
 | Percent male spikelets^u | 5 | 5.0-22.5 | 3.0-15.9 | |
 | Number of rows of cupules | 6 | 5.0-36.0 | 2.8-15.9 | |

Cowpea (Vigna unguiculata L.)
F2 | Seed weight | 4 | | 32.0-37 | 53.0 | Fatokun et al., 1992

Potato (Solanum tuberosum L.)
BC | Type A trichome browning reaction | 2 | 20.2-52.0 | 6.5-22.1 | [illegible] | Bonierbale et al., 1994
 | Type A trichome density | 1 | 32.0 | 11.1 | 32.0 |
 | Type A trichome polyphenol oxidase conc. | 2 | 13.2-23.1 | 2.9-6.3 | 27.0 |
 | Type B trichome sucrose ester levels | 5 | 6.1-49.4 | 2.0-19.2 | 67.6 |
 | Type B trichome density | 2 | 8.6-35.4 | [illegible] | [illegible] | ...
F1 | Late blight | 19 | | | | Leonards-Schippers et al., 1994
F1 | Tuber shape | 1^s | | | [illegible] | van Eck et al., 1994
F1 | Chip color | [illegible] | | [illegible] | [illegible] | Douches and Freyre, 1994

Tomato (Lycopersicon spp.)
BC | Fruit mass | 5 | | | 58.0 | Paterson et al., 1988
 | Soluble solids | 4 | | | 44.0 |
 | Fruit pH | 5 | | | |
F2 | Insect resistance | 3 | | | | Nienhuis et al., 1987
F2 | Days to first true leaf | 3 | 2.6-13.1 | 2.4-11.9 | 18.0 | de Vicente and Tanksley, 1993
 | Days to first flower | 7 | 3.5-10.2 | 2.2-8.0 | 43.0 |
 | Plant height | 9 | 3.1-8.4 | 2.2-8.2 | 42.0 |
 | Total number of flower buds | 10 | 2.8-34.0 | 2.2-34.1 | 61.0 |
 | Number of internodes on primary stem | 5 | 4.8-7.4 | 3.2-5.71 | 23.0 |
 | Total number of internodes | 9 | 3.7-12.8 | 2.8-10.8 | 52.0 |
 | Number of well developed branches | 8 | 3.1-9.5 | 2.7-7.5 | 53.0 |
 | Total plant fresh weight | 2 | 3.0-4.4 | 2.2-2.74 | 7.0 |
 | Total plant dry weight | 5 | 3.2-7.0 | 2.3-3.5 | 21.0 |
F2:F3 | Soluble solids | 7 | 3.0-12.0 | 6.0-28.0 | 44.0 | Paterson et al., 1991
 | Fruit mass | 11 | 2.3-21.5 | 4.0-42.0 | 72.0 |
 | Fruit pH | 9 | 2.4-6.1 | 4.2-28.0 | 34.0 |

^z Given on a per locus basis.
^y F2 classified by F3 families.
^x Growing degree days.
^w Zea mays; corn x teosinte.
^v In primary lateral branches.
^u In primary lateral inflorescence.
^t Pedicellate spikelet.
^s Multiple alleles were detected.
^r Total genetic variance.

The simplifying assumptions of computer- 
based models, however, can lead to over- or 
under-estimations of R. For instance, fitness 



plays an integral part in response of individu- 
als to selection. Heterotic advantage can be 
defined in terms of fitness. In many species, 
heterosis is pivotal to the expression of aver- 
age fitness in a population. Estimates of R are 
only valid if an individual's "fitness" is inter- 
preted as the probability that an individual is 
included among the group selected as parents 
in the next generation (Gimelfarb, 1989). 



The potential utility of MAS in practical 
plant breeding programs is limited by: 1) the 
number of molecular marker loci required to 
detect all significant linkage associations; 2) 
population sizes required to detect QTLs for 
traits with low heritability; 3) the sampling 
errors associated with the weighting of indices 
when combining molecular marker loci; and 
4) phenotypic information and the cost per
unit information gained (Edwards and Page,
1994; Lande and Thompson, 1990). Marker 
effectiveness (i.e., selection response) increases 
as the number of QTLs affecting a trait de- 
creases (Edwards and Page, 1994). The effec- 
tiveness of MAS decreases as the linkage 
distance between markers and QTLs increases. 
Moreover, greater genetic gain can be made
when two marker loci flanking a QTL
are used as compared to single markers, if
single markers are loosely linked to a QTL.
However, the use of flanking markers requires 
the characterization of twice as many markers 
as compared to selection using single markers. 
Thus, where dense maps are available, the 
value (cost/unit information) of flanking mark-
ers decreases as marker-QTL associations in-
crease.
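
The benefit of flanking markers over a single loosely linked marker can be illustrated with gamete probabilities. The sketch assumes a single meiosis, a QTL lying between the two markers, and no crossover interference; it gives the probability that a gamete selected for the favorable marker allele(s) also carries the favorable QTL allele. The function names are hypothetical.

```python
def qtl_retained_single(r):
    """P(QTL allele present | linked marker allele present)."""
    return 1 - r


def qtl_retained_flanking(r1, r2):
    """P(QTL allele present | both flanking marker alleles present).

    r1, r2 -- recombination fractions between the QTL and each marker.
    The QTL allele is lost only through a double crossover (r1 * r2).
    """
    no_crossover = (1 - r1) * (1 - r2)
    double_crossover = r1 * r2
    return no_crossover / (no_crossover + double_crossover)


for r in (0.02, 0.10, 0.25):
    print(r,
          round(qtl_retained_single(r), 4),
          round(qtl_retained_flanking(r, r), 4))
```

Under these assumptions the two approaches are nearly equivalent at tight linkage, and the gap widens as the markers become more loosely linked to the QTL.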

The effectiveness of MAS is also deter- 
mined by the relative linkage disequilibria 
between the marker loci and QTLs that condi- 
tion trait expression (Lande and Thompson, 
1990). Linkage disequilibrium (between genetic
markers and QTLs) is maximized by mating
germplasm of divergent origin. Such
matings occur regularly during plant improve- 
ment (e.g., crossing between elite inbreds to 
begin genetic recombination for line develop- 
ment). Fixation of desirable trait loci in an elite 
background is the goal of breeding programs. 
Greater genetic gain is likely when fewer 
genes are involved in trait behavior because 
less recombination occurs (Edwards and Page, 
1994). Having many QTLs exacerbates the 
problem of marker-QTL recombination, and 
thus the time required for fixation increases as 
the number of QTLs associated with a particu- 
lar trait increases. 

The multiple regression of phenotype on 
genetic markers can be used during MAS to 
provide a tactical assessment of gain from 
selection. Analysis of such relationships dur- 
ing MAS capitalizes on the linkage disequilib- 
rium generated by the original mating of two 
inbred lines (Gimelfarb and Lande, 1994a). 
MAS can, therefore, be very effective during 
the early generations of population improve- 
ment where important linkages have not been 
eroded by recombination (Edwards and Page,
1994; Lande and Thompson, 1990). Markers 
that contribute significantly to selection in 
initial generations should be re-evaluated each 
generation to determine their continued effec- 
tiveness (Gimelfarb and Lande, 1994a). This 
need can be costly and undermine the potential 
usefulness of MAS. 
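
The erosion of marker-QTL associations over generations can be sketched with the standard two-locus result for random mating, in which linkage disequilibrium D declines by a factor of (1 - r) each generation. The example ignores selection and drift, which a breeding population would also experience, and the starting value is hypothetical.

```python
def disequilibrium_after(d0, r, generations):
    """D_t = D_0 * (1 - r)^t under random mating, no selection or drift."""
    return d0 * (1 - r) ** generations


d0 = 0.25  # hypothetical starting disequilibrium from an inbred-line cross
for r in (0.01, 0.10, 0.50):
    decay = [round(disequilibrium_after(d0, r, t), 4) for t in (0, 1, 5, 10)]
    print(r, decay)
```

Tightly linked markers retain most of their association for many generations, whereas loosely linked markers lose it quickly, which is one reason marker effects need to be re-evaluated as selection proceeds.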

If a significant amount of the additive vari- 
ance associated with a QTL can be accounted 
for by molecular markers, then MAS can in- 
crease breeding efficiency (Edwards and Page, 
1994; Gimelfarb and Lande, 1994b). Likewise, 
the effectiveness of marker loci will be in- 
creased as the number of individuals in a 
population is increased, since a greater pro- 
portion of the additive genetic variance can 
potentially be explained. When trait heritabil- 
ity is low, population size must be relatively 
large (100-1000 individuals) to include unre- 
lated individuals that detect additive variance 
associated with marker loci. Moreover, full- 
or half-sib populations are likely to require 



larger samples, depending on the degree of 
dominance associated with the trait. 

Sample size is important when considering 
the potential loss of efficiency in MAS during 
protracted index selection (Gimelfarb and 
Lande, 1994a). Sampling errors can occur 
during model building as the relative weights 
of molecular and phenotypic information are 
estimated. MAS may only be cost efficient if 
phenotypic selection is made difficult due to 
large environmental effects and/or the number 
of loci affecting such traits is large (Lande and 
Thompson, 1990). However, increasing the 
number of markers that contribute to the selec- 
tion index does not necessarily increase the 
effectiveness of MAS. The use of many mark- 
ers may in fact result in a weaker response to 
selection. 

Application 

Simmonds (1979) has stated that "...plant 
breeding often not only generates benefits but 
is also attractive in having relatively low imple- 
mentation costs..." This statement may no 
longer apply if MAS is rigorously applied to 
crop improvement. The costs of MAS can be 
high when compared to classical phenotypic 
selection, and the benefit : cost ratio may not
be high enough to warrant use of MAS. The 
cost-benefit relationship can now be more 
critically evaluated in differing marker sys- 
tems (Table 1). For instance, Lande (1992) 
indicated that the cost of scoring RFLPs is on 
the order of 100 to 1000 times as expensive as 
measuring standard phenotypes in most crops. 
Simulation experiments by Ragot and 
Hoisington (1993) indicated that the costs for 
employing RAPDs may be higher than for 
RFLPs as the number of individuals and mark- 
ers in an experiment are increased. While 
RAPDs were found to be most cost/time effi- 
cient when sample sizes were small, RFLPs 
were more advantageous as sample sizes were 
increased. Darvasi and Soller (1994) consid- 
ered experimental costs (i.e., number of indi- 
viduals and marker spacing) that might be 
incurred during QTL analysis of a plant spe- 
cies possessing a genome size of 1000 cM. 
They concluded that the costs of MAS would 
be prohibitive since hundreds of individuals 
would be needed in a typical marker-QTL 
experiment, even if all possible markers could 
be used and the power of QTL detection (i.e., 
LOD) was high. 

Although the costs associated with MAS 
are currently high, it has been shown to have 
potential utility for managing complex traits 
(Table 3). For instance, Stuber and Edwards 
(1986) recorded similar genetic gains when 
using either phenotypic selection or MAS 
(isozymes) for quantitative traits in maize. 
Similar results were found by Stromberg (as 
cited by Dudley, 1993). More recently, Stuber
(1995) observed that significant genetic gain 
could be obtained during marker-assisted back- 
crossing in maize aimed at transferring tar- 
geted QTLs from Tx303 into B73 and from 
Oh43 into Mo17. Although the yield of hy-
brids between the "enhanced" B73 and Mo17
exceeded that of control hybrids by more than 



15%, no parallel assessment of the relative 
efficiency of classical breeding was included. 
Edwards and Johnson (1994) used MAS in 
two sweet corn populations (A and B) and 
found a positive response from selection (two 
cycles per year) for several traits. Elite lines 
were crossed to produce F4 lines that were then
subjected to either MAS (RFLPs) or pheno- 
typic recurrent selection (PRS) for yield and 
quality traits. These original parents and the 
resultant populations (MAS and PRS) were
crossed to two testers for replicated compari- 
son at one location in one year. Although 
positive response to selection was observed 
for six of 11 traits in one population (A), 
differences between the average performances 
of the hybrids developed from MAS lines and 
parental hybrids were not significant. Like- 
wise, the overall response to phenotypic or 
MAS selection was similar in a second popu- 
lation (B). The authors hypothesized that this 
lack of response to MAS was due to antagonis- 
tic effects of genome regions responsible for 
the yield and quality traits observed. 

SUMMARY 

Molecular markers and associated tech- 
nologies can assist in map construction and the 
analysis of the molecular and genetic basis of 
quantitative and qualitative traits. Molecular 
markers that are tightly linked to economi- 
cally important traits that are under the control 
of single genes have potential for immediate 
utility in plant improvement programs. Per-
haps the most optimistic prospects for MAS are
in disease resistance breeding, especially where 
several genes control resistance with complex 
interactions and where pyramiding of genes is 
desirable (Kelly, 1995; Schafer and Roelfs, 
1985; Stavely et al., 1989).

Optimal use of QTL regions will, however, 
require a knowledge of their often complex 
epistatic interactions. Moreover, although 
MAS may have potential in population and 
inbred line development, it will likely have 
little or no effect in reducing the need for 
replicated field trials and testing. Optimal 
methods for mapping QTLs are still being 
debated and more sophisticated computer-
aided analysis procedures are being devel-
oped. When QTLs and single genes are ad-
equately mapped, they can be isolated bio-
chemically (Tanksley et al., 1995; Young,
1990). Methods for their isolation (i.e., clon- 
ing) and characterization are also points of 
considerable discussion. 

The effectiveness of any MAS procedure 
will depend on the accuracy of the phenotypic 
classification of trait expression and the de- 
gree of linkage between a marker(s) and traits 
of interest. Although MAS may increase gain 
from selection when compared to phenotypic 
selection, marker utility in plant improvement 
programs will ultimately be determined by 
cost/unit information (Edwards and Page, 
1994). Clearly, laboratory costs associated 
with MAS applications are decreasing, and 
more effective and efficient molecular mark- 
ers are being developed (Gu et al., 1995). This 
progress will make MAS more attractive and
will foster its prudent implementation as a tool
for plant breeding. As a result of such changes, 
MAS might have potential for selection of 
characters such as yield components in agro- 
nomic and horticultural crops. Nevertheless, 
in horticultural crops, where many complex 
and highly integrated aesthetic, culinary, and 
organoleptic attributes are considered neces- 
sary for market acceptance, plant breeding 
expertise and decision making ability will 
clearly remain pivotal for genetic improve-
ment.

Introduction

In recent years considerable effort has been devoted to the identification of highly informative molecular markers
for screening diversity in forest tree species. The identification of efficient methods, in terms of costs 
and time, for accurate analyses of patterns of diversity in forest trees is extremely relevant, because these 
organisms typically show a high level of variability, so that sampling of a large number of populations 
and individuals is required for each study. Microsatellites are tandem repeats characterised by short 
motifs (1 to 6 bp), a low degree of repetition (5 to 100 repeat units) and a randomly dispersed
distribution of about $10^4$ to $10^5$ microsatellite regions per genome (Tautz, 1993). They display a
presumably selectively neutral behaviour, show a co-dominant inheritance that allows discrimination of 
homo- and heterozygotic states, and occur frequently and evenly distributed throughout the genome. 
Their high degree of length polymorphism, which is due to different numbers of repeats within the 
microsatellite regions, can be easily and reproducibly detected via the polymerase chain reaction (PCR). 
Their main applications are in genome mapping and in population analysis, but microsatellites are also 
useful for taxonomy, parentage analysis, identification of individuals in forensic studies, and human 
cancer diagnostics. 

Microsatellites are not limited to the nuclear genome. They occur in chloroplasts and also in the 
mitochondrial genome, where Soranzo et al. (1999) found one as a repetition of G/C. This study reviews 
work done in the field of chloroplast microsatellites in conifer species. 

Chloroplast microsatellites (cpSSR) 

The availability of the entire chloroplast sequence of the Japanese pine species Pinus thunbergii 
(Wakasugi et al., 1994) allowed the identification of cpSSRs (chloroplast simple sequence repeats). These 
microsatellites consist of repetitions of a single nucleotide (19 A/T and 1 G/C) (Powell et al., 1995; 
Vendramin et al., 1996). Primers for the amplification of chloroplast microsatellites were also recently 
designed for angiosperms (Weising and Gardner, 1999). 

Because the chloroplast genome is uniparentally inherited and does not recombine, being paternally inherited in conifers 
(e.g. for cp-microsatellites: Cato and Richardson, 1996; Vendramin and Ziegenhagen, 1997; Sperisen et 
al., 1998) and maternally inherited in angiosperms (e.g. for PCR/RFLP polymorphism in oaks: Dumolin 
et al., 1995), cpSSR variants accumulate in a uniparental chloroplast lineage and can thus provide 
information about the history of populations. Microsatellite variants are thought to be generated in a 
stepwise manner by addition or deletion of single repeat units. Under such a stepwise mutation model 
(Valdes et al., 1993), microsatellite variants with small repeat length differences are more closely related 
than alleles with larger length differences, and consequently the process of 
mutation can be considered to have a "memory" (Jarne and Lagoda, 1996). Computer simulations have produced linear 
relationships between genetic distances based on the size differences of the SSR alleles and the time of 
divergence (Slatkin, 1995; Goldstein et al., 1995). 
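
As a rough illustration of how size differences can be turned into a distance under a stepwise model, the sketch below averages over loci the squared difference in mean repeat number between two populations, in the spirit of the size-based distances cited above. The function name, the 1 bp repeat unit and the fragment sizes are assumptions made for illustration; they are not data or software from the studies cited.

```python
from statistics import mean

def delta_mu_squared(pop_a, pop_b, repeat_unit_bp=1):
    """Average over loci of the squared difference in mean repeat number.

    pop_a, pop_b: dict mapping locus name -> list of fragment sizes (bp).
    repeat_unit_bp: length of one repeat unit (1 for mononucleotide cpSSRs).
    """
    shared = sorted(set(pop_a) & set(pop_b))
    if not shared:
        raise ValueError("no loci in common")
    per_locus = []
    for locus in shared:
        mu_a = mean(pop_a[locus]) / repeat_unit_bp
        mu_b = mean(pop_b[locus]) / repeat_unit_bp
        per_locus.append((mu_a - mu_b) ** 2)
    return mean(per_locus)

# Hypothetical fragment sizes (bp), invented for illustration only.
north = {"locus1": [145, 146, 146, 147], "locus2": [96, 96, 97, 97]}
south = {"locus1": [148, 148, 149, 149], "locus2": [96, 96, 96, 96]}
print(delta_mu_squared(north, south))
```

Because the measure depends on differences in mean size rather than on allele identity, two populations whose variants differ by one repeat come out closer than two whose variants differ by several, which is the "memory" property described above.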

Methods for the detection of chloroplast microsatellite polymorphism 

Standard methods were optimised for the characterisation and the screening of chloroplast 
microsatellites in conifers. The main steps of the procedures for the amplification and detection of length 
polymorphism can be summarised as follows (for details see Vendramin et al., 1996). 

PCR conditions 

PCR amplifications were carried out using a Perkin Elmer model 9600 thermal cycler in a total volume 
of 25 µl containing 0.2 mM of each dNTP, 2.5 mM MgCl2, 0.2 µM of each primer, 10x reaction 
buffer (Pharmacia), 25 ng of template DNA and 1 unit of Pharmacia Taq polymerase, with the following 
profile: 5 min denaturation at 95°C, 5 min at 80°C for enzyme addition, followed by 25 cycles of 1 min 
denaturation at 94°C, 1 min annealing at 55°C and 1 min extension at 72°C, with a final extension step 
at 72°C for 8 min. Amplification reactions were automatically prepared using a Biomek 2000 robotic 
workstation (Beckman Instruments). One of the two PCR primers in each reaction was 5' fluorescein- 
labelled. 
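
For planning purposes, the thermal profile above can be written down as data and its programmed hold time added up. The sketch below is only that arithmetic; the variable and function names are illustrative, and ramping time between temperatures is ignored, so the real run on any given cycler will take longer.

```python
# Thermal profile as described in the text: (label, temperature in deg C, minutes).
INITIAL_STEPS = [("denaturation", 95, 5), ("hold for enzyme addition", 80, 5)]
CYCLE_STEPS = [("denaturation", 94, 1), ("annealing", 55, 1), ("extension", 72, 1)]
N_CYCLES = 25
FINAL_STEPS = [("final extension", 72, 8)]

def programmed_minutes(initial, cycle, n_cycles, final):
    """Sum of programmed hold times; ramping between temperatures is ignored."""
    one_cycle = sum(minutes for _, _, minutes in cycle)
    return (sum(minutes for _, _, minutes in initial)
            + n_cycles * one_cycle
            + sum(minutes for _, _, minutes in final))

total = programmed_minutes(INITIAL_STEPS, CYCLE_STEPS, N_CYCLES, FINAL_STEPS)
print(total)  # 5 + 5 + 25 * 3 + 8 = 93
```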

Sizing and sequencing of amplification products 

Several pairs of primers for the amplification of chloroplast microsatellites were designed in order to 
obtain fragments having different size ranges, thus allowing multiplexing by size range. Two or three 
microsatellite-containing fragments in different size ranges were loaded simultaneously, together with 
internal molecular weight standards, on a 6%, 20 cm long denaturing 7 M urea, 0.6x Tris-Borate-EDTA 
polyacrylamide gel (Pharmacia) and run on an ALF automatic sequencer (Pharmacia) at 35 W 
constant power for approximately 80 minutes. The same gel was loaded twice. External molecular 
weight standards as well as internal standards (50, 100, 150 and 200 bp) were used in conjunction with 
the Fragment Manager version 1.2 conversion software (Pharmacia) to size the amplified fragments. An 
example of sizing of chloroplast microsatellites using the automatic ALF Pharmacia sequencer is 
reported in Figure 1. 
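
The idea behind sizing against internal standards can be sketched as interpolation between the two standards that flank a peak. The text does not describe the algorithm used by the Fragment Manager software, so the piecewise-linear rule below is an assumption, as are the function name and the migration times; only the standard sizes (50, 100, 150 and 200 bp) come from the text.

```python
from bisect import bisect_right

def size_fragment(peak_time, standards):
    """Estimate fragment length (bp) by linear interpolation between standards.

    standards: list of (migration_time_min, size_bp) for the internal markers.
    peak_time: migration time of the unknown peak, within the standards' range.
    """
    pts = sorted(standards)
    times = [t for t, _ in pts]
    if not times[0] <= peak_time <= times[-1]:
        raise ValueError("peak lies outside the range of the standards")
    # Index of the standard just above the peak (clamped at the last standard).
    i = min(bisect_right(times, peak_time), len(pts) - 1)
    (t0, s0), (t1, s1) = pts[i - 1], pts[i]
    return s0 + (s1 - s0) * (peak_time - t0) / (t1 - t0)

# Internal standards of 50, 100, 150 and 200 bp as in the text;
# the migration times are invented for illustration.
standards = [(20.0, 50), (35.0, 100), (50.0, 150), (65.0, 200)]
print(round(size_fragment(48.8, standards)))
```

Since cpSSR variants often differ by a single base, the rounding step is where sizing errors would show up, which is why the text stresses running both internal and external standards.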

Selected polymorphic amplification products were sequenced in order to confirm the presence of the 
microsatellite regions in the amplified fragments as well as to verify that length variation was due to a 
different number of repeats within the microsatellite regions and not to mutations in the flanking 
regions. For this purpose amplified fragments were cloned into a pCR II plasmid vector (Invitrogen) and 
then sequenced using the ALF automatic sequencer (Pharmacia). The clones were sequenced from both 
ends, using M13 universal fluorescent-labelled primers and T7 DNA polymerase (Pharmacia). The 
sequences were run on 6% polyacrylamide, 7 M urea, 0.6x Tris-Borate-EDTA gels at 1500 V, 38 mA and 
34 W for 3 hours. Two clones from each cloning experiment were sequenced. An example of the 
sequencing of a chloroplast microsatellite region in Abies alba is reported in Figure 2. 

Molecular organisation of chloroplast microsatellites in two conifer species 

Abies alba Mill. (silver fir) 

Sequencing was done for those amplification products that revealed polymorphism between parents 
involved in controlled crosses and exhibited sufficient polymorphism (Pt 30204 and Pt 71936, codes as 
in Vendramin et al., 1996). Figure 2a gives the sequences for microsatellite locus Pt 30204 of three A. 
alba individuals (A, B, C) and one A. pinsapo individual. Figure 2b depicts the sequences for 
microsatellite locus Pt 71936 of A. alba individual C and the A. pinsapo individual. Moreover, in 
Figures 2a and 2b, the sequences of the two microsatellite loci are aligned for the parental Abies 
individuals and Pinus thunbergii. 

Figures 2a and 2b clearly reveal that the amplified loci contain simple sequence repeats and that their 
length polymorphism is due to a variable number of repeat units, thus confirming these loci to be 
chloroplast microsatellites. In the completely sequenced chloroplast genome of Pinus thunbergii (see 
above) the microsatellite locus Pt 30204 is characterised by a jointly occurring C and T mononucleotide 
stretch, (C)10 x (T)12. In the investigated individuals of Abies this locus turns out to be composed of 
variable numbers of three mononucleotide repeats (Figure 2a). Microsatellite locus Pt 71936 contains a 
mononucleotide repeat characterised by a variable number of T for all three species Abies alba, A. 
pinsapo and P. thunbergii (Figure 2b). 

For locus Pt 30204, alignment of sequences reveals a striking heterogeneity of molecular organisation 
within the species A. alba (Figure 2a). Individual A differs from conspecific individuals B and C by 
insertions/deletions within the mononucleotide stretches, an addition of cytosine repeat units, and also by 
heterologous sequences in the interspersed non-coding regions. In this organisation, A. alba 
individual A resembles P. thunbergii (93% homology) more than it resembles the conspecific individuals 
B and C (e.g. 79% homology to B), while B and C differ only by an insertion/deletion of one adenosine in 
the repetitive stretch. Also striking is the finding that for this microsatellite locus the A. pinsapo 
individual is more homologous to the A. alba individuals B and C than is A to either B or C. 

The molecular organisation of SSR locus Pt 71936 seems to be more conserved throughout all 
sequences (Figure 2b). Size variations are more or less restricted to variable numbers of repeat units. 

Sequence analysis of the microsatellite Pt 30204 revealed that in many cases individuals sharing the 
same size are also identical in sequence. Nevertheless, individuals carrying the same size variants are 
not always characterised by the same microsatellite sequence, as was discovered by analysing 
individuals belonging to different populations as well as individuals of the same population. This 
indicates that size variation may underestimate the underlying sequence variation (Ziegenhagen et al., in 
preparation). 

Picea abies (L.) Karst. (Norway spruce) 

Sequence analysis of amplified fragments revealed the occurrence of microsatellites and showed that 
size variation at the same microsatellite locus was due to variation in the copy number of SSRs 
(Sperisen et al., 1998). The three microsatellites (Pt 26081, Pt 63718 and Pt 71936, codes as in 
Vendramin et al., 1996) consisted of A/T mononucleotide repeats. Alignment of the Pinus thunbergii 
and Norway spruce sequences showed a very high degree of homology. Size variation at chloroplast 
microsatellite loci Pt 71936 and Pt 63718 was restricted to differences in the copy number of SSRs, with 
the flanking sequences being identical. The sequences at microsatellite locus Pt 26081 showed 
variation in the copy number of SSRs as well as three insertions/deletions and a nucleotide substitution in the 
regions flanking the SSRs. Moreover, microsatellite locus Pt 71936 of Norway spruce revealed an 
organisation similar to that of the same microsatellite in silver fir. Both sequences contained an SSR and 
differed by one insertion/deletion and four nucleotide substitutions in the regions flanking the SSRs. 

Inheritance of chloroplast microsatellites 

Inheritance of chloroplast microsatellites was tested in Norway spruce (Picea abies (L.) Karst.) by Sperisen et 
al. (1998) and in silver fir (Abies alba Mill.) by Vendramin and Ziegenhagen (1997). 

The mode of inheritance of the chloroplast microsatellites in Norway spruce was analysed in the 
progenies of seven controlled crosses including a reciprocal cross (Sperisen et al., 1998). The progenies 
of all crosses exclusively showed the size variant found in the male parent, thus indicating that the three 
chloroplast microsatellites are paternally inherited (Table 1). The absence of the size variant of the 
female parent in the embryos appeared to exclude the occurrence of heteroplasmy as a result of maternal 
leakage. 



SSR*       Cross              Female parent         Male parent           Size variants (bp) 
                              size variants (bp)    size variants (bp)    of progenies (N) 

Pt 26081   1895 x 5444        116                   115                   115 (24) 
           5443 x 39          115                   116                   116 (24) 
           5460 x 2037        116                   111                   111 (24) 
Pt 63718   1641 x 5468         99                   102                   102 (24) 
           1895 x 5444        100                    96                    96 (24) 
           5443 x 39          100                    96                    96 (24) 
           reciprocal cross    96                   100                   100 (24) 
           reciprocal cross   100                    96                    96 (24) 
           5451 x 5444        100                    96                    96 (24) 
Pt 71936   1895 x 5444        145                   146                   146 (24) 

Table 1: Transmission of chloroplast microsatellites in Norway spruce as studied in intraspecific crosses 
(from Sperisen et al., 1999). bp = base pairs; N = number of seeds analysed; * codes refer to Vendramin 
et al. (1996). 
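
The reasoning behind Table 1 can be sketched as a simple check: a cross is informative when the parents carry different size variants, and the progeny variant then points to the transmitting parent. The values below are transcribed from Table 1; the function name and the category labels are assumptions made for illustration.

```python
# (locus, cross, female size, male size, progeny size, seeds analysed), from Table 1.
TABLE_1 = [
    ("Pt 26081", "1895 x 5444", 116, 115, 115, 24),
    ("Pt 26081", "5443 x 39", 115, 116, 116, 24),
    ("Pt 26081", "5460 x 2037", 116, 111, 111, 24),
    ("Pt 63718", "1641 x 5468", 99, 102, 102, 24),
    ("Pt 63718", "1895 x 5444", 100, 96, 96, 24),
    ("Pt 63718", "5443 x 39", 100, 96, 96, 24),
    ("Pt 63718", "reciprocal cross", 96, 100, 100, 24),
    ("Pt 63718", "reciprocal cross", 100, 96, 96, 24),
    ("Pt 63718", "5451 x 5444", 100, 96, 96, 24),
    ("Pt 71936", "1895 x 5444", 145, 146, 146, 24),
]

def classify(female, male, progeny):
    """Label the transmitting parent for one cross at one locus."""
    if female == male:
        return "uninformative"  # parents indistinguishable at this locus
    if progeny == male:
        return "paternal"
    if progeny == female:
        return "maternal"
    return "neither"  # e.g. a new size variant or a scoring error

for locus, cross, female, male, progeny, n in TABLE_1:
    print(locus, cross, classify(female, male, progeny), n)
```

Every row is classified as paternal, in line with the conclusion drawn in the text.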



Using primer pairs derived from chloroplast simple sequence repeats of Pinus thunbergii, two 
polymorphic SSR loci were identified and sequence-characterised in the genus Abies (Vendramin and 
Ziegenhagen, 1997). PCR products exhibited considerable length variation among six different Abies 
species and within A. alba. A total of 75 F1 progeny of both an interspecific and an intraspecific 
reciprocal cross confirmed the two SSRs to be stably inherited and to follow predominantly paternal 
inheritance (Table 2). When, in addition to the embryo of each seed, the primary haploid endosperm 
(megagametophyte) was also analysed, the size variant of the seed parent predominantly occurred, thus 
giving evidence for the elimination of the maternal plastid only in the egg cell or in the pro-embryos. In 
Table 3, results are given for PCR amplification of genomic DNA from nine Abies individuals from six 
different species. Of the 11 tested primer pairs, two generated amplification products (Pt 30204 and 
Pt 71936) which showed considerable length variation among A. alba individuals but also among the six 
investigated Abies species (Table 3, shaded rows). Seven size variants of PCR products were observed 
when using primer pair Pt 30204 and eight size variants when using Pt 71936. One primer pair (Pt 
15169) exhibited size variation only for the A. cephalonica individual. Six primer pairs produced 
amplification products, each of identical length for all individuals under study. One primer pair gave 
very poor amplification, and the remaining pair failed to generate any amplification product at all. 



Paternal inheritance of chloroplast microsatellites was also tested and confirmed in interspecific crosses 
between Pinus halepensis Mill. and Pinus brutia Ten. (Anzidei et al., in preparation). 



SSR        Maternal parent   Paternal parent   Seedling progeny   Seed progeny 
           size variant      size variant      size variant       Embryos             Megagametophytes 
                                               (n of N)           size variant        size variant 
                                                                  (n of N)            (n of N) 

Pt 71936   A. alba C  147    A. pinsapo 156    156 (6 of 6) 
Pt 30204   A. alba C  136    A. pinsapo 139    139 (6 of 6) 

Pt 30204   A. alba A  147    A. alba B  137                       137 (30 of 30)      147 (18 of 18) and 
                                                                                      137 (2 of 18) 

Pt 30204   A. alba B  137    A. alba A  147                       147 (33 of 33) and  137 (23 of 23) 
                                                                  137 (1 of 33) 

Table 2: Transmission of cp microsatellites (Pt 30204 and Pt 71936) in Abies as studied in an 
interspecific and in an intraspecific reciprocal cross (from Vendramin and Ziegenhagen, 1997). n = 
number of observed size variants, N = number of investigated individuals, embryos or 
megagametophytes. 



Code*       Primer sequences 5' - 3'           Abies individuals No. 1-9, size of PCR products [bp] 
            (sense / antisense)                  1    2    3    4    5    6    7    8    9 

Pt 9383     AGA ATA AAC TGA CGT AGA TGC CA     no amplification 
            AAT TTT CAA TTC CTT TCT TTC TCC 
Pt 15169    CTT GGA TGG AAT AGC AGC C          131  131  131  131  131  131  131  131  130 
            GGA AGG GCA TTA AGG TCA TTA 
Pt 26081    CCC GTA TCC AGA TAT ACT TCC A      112  112  112  112  112  112  112  112  112 
            TGG TTT GAT TCA TTC GTT CAT 
Pt 30204    TCA TAG CGG AAG ATC CTC TTT        147  137  136  149  140  149  139  149  148 
            CGG ATT GAT CCT AAC CAT ACC 
Pt 36480    TTT TGG CTT ACA AAA TAA AAG AGG    149  149  149  149  149  149  149  149  149 
            AAA TTC CTA AAG AAG GAA GAG CA 
Pt 41093    TCC CGA AAA TAC TAA AAA AGC A      poor amplification 
            CTC ATT GTT GAA CTC ATC GAG A 
Pt 48210    CGA GAT TGA TCC GAT ACC AG          89   89   89   89   89   89   89   89   89 
            GAG AGA ACT CTC GAA TTT TTC G 
Pt 71936    TTC ATT GGA AAT ACA CTA GCC C      149  149  147  148  151  150  156  152  150 
            AAA ACC GTA CAT GAG ATT CCC 
Pt 79951    CTT TTG TTT TTC AAC AAT TGC A      138  138  138  138  138  138  138  138  138 
            ACA TCT ATC TCC CAT ATC GGC 
Pt 87268    GCC AGG GAA AAT CGT AGG            171  171  171  171  171  171  171  171  171 
            AGA CGA TTA GAC ATC CAA CCC 
Pt 110048   TAA GGG GAC TAG AGC AGG CTA         90   90   90   90   90   90   90   90   90 
            TTC GAT ATT GAA CCT TGG ACA 

Table 3: Size of PCR products obtained amplifying genomic DNA of six Abies species with 11 primer 
pairs matching flanking regions of simple sequence repeats (SSRs) in cpDNA of Pinus thunbergii 
(Wakasugi et al., 1994) (from Vendramin and Ziegenhagen, 1997). * denotes the position of the 5' base 
of sense primer in the published P. thunbergii cpDNA sequence. Individuals No. 1: A. alba A; 2: A. alba 
B; 3: A. alba C; 4: A. nordmanniana; 5: A. cilicica; 6: A. numidica; 7: A. pinsapo; 8: A. nordmanniana; 9: 
A. cephalonica 



Universality of chloroplast microsatellite markers 

The identification of microsatellite regions is a very expensive and time-consuming process, which 
generally requires the construction and screening of a genomic library (efficient protocols for 
microsatellite enrichment are available, e.g. Edwards et al., 1996). Considerable effort is therefore 
needed for sequencing and for the optimisation of the markers. Generally not more than 25% of the 
identified microsatellites are single-locus Mendelian markers. Besides the construction of enriched 
libraries, one possible strategy for identifying microsatellite regions more efficiently 
is to transfer microsatellite markers developed in one species to 
others, thereby reducing the cost of their development. 

In the case of nuclear microsatellites, this strategy has not proved very efficient. Echt et al. 
(1998), for example, using SSR primer pairs from Pinus strobus and Pinus radiata, found that while 
primers for monomorphic loci could amplify loci from a wide range of species, the primers for 
informative dinucleotide repeat loci could only amplify loci from species within the same subgenus. 
Primers for the amplification of nuclear microsatellites in Pinus halepensis also work in the closely 
related species Pinus brutia (Keys et al., 1999). Twenty-five primer pairs developed for Pinus 
halepensis were also tested in Pinus pinaster, but only one of them produced polymorphic amplification 
products that showed only a single band per haploid genome; the remaining 24 pairs produced either no 
amplification product, single monomorphic bands, or multiband patterns (Mariette et al., 1999). 

For the chloroplast genome, by contrast, the high level of DNA sequence conservation, including 
the arrangement of genes and intergenic sequences in conifers as well as in angiosperms, confers on 
cpSSR markers a very high degree of "universality". Thus, primers designed on the basis of the 
sequences of the chloroplast genome of Pinus thunbergii also worked in many other conifers (Powell et 
al., 1995; Vendramin et al., 1996) as well as in angiosperms (Cato and Richardson, 1996). The high 
degree of conservation of sequences in the chloroplast genome of conifers was also confirmed by studies 
performed by Vendramin et al. (1996) in Pinus leucodermis, Vendramin and Ziegenhagen (1997) in 
Abies alba and Sperisen et al. (1998) in Picea abies. These primers have been used with success in 110 
different conifer species belonging to different taxonomic classifications, in particular to the Pinaceae, 
Cupressaceae and Taxodiaceae (Vendramin et al., unpublished data), thus dramatically reducing the 
cost of development of these markers. Sequencing data in Picea abies, Abies alba, Pinus halepensis, 
Pinus brutia, Pinus pinaster, Pinus pinea, and Pinus cembra always confirmed the presence of the 
microsatellite region in the amplified fragments. Analyses are also in progress to verify the presence of 
cpSSRs in the amplified fragments of Cupressus sempervirens and Taxus baccata. The detection of a 
typical 1 bp variation in the amplified fragments of different individuals of Cupressus sempervirens 
seems to be evidence of the presence of cpSSR regions in the Cupressaceae as well. 

Equipment and cost 

In principle, only basic molecular biology facilities are necessary for the analysis of chloroplast 
microsatellite polymorphism, such as a PCR thermal cycler and a system for vertical gel electrophoresis. 
However, considering that chloroplast microsatellites are repetitions of a single nucleotide and that 
single-base length variation must therefore be detected, a system with high resolution (sequencing gel 
apparatus) should be available. High accuracy in sizing the amplified fragments can be obtained using 
internal and external standards of known molecular weight (see Figure 1). 

The efficiency of the screening can, however, be increased considerably through the automation of the 
procedures: multiplexing of PCR reactions and/or multiple loading of a single gel (at least twice), as well as 
the use of automated DNA sequencing apparatus (which does not require radioactive labelling) with 
appropriate software, can speed up and automate the genotyping to a considerable extent. The use of a 
robotic workstation equipped with a multi-channel pipettor allows the preparation of 96 amplification 
reactions in about 20 minutes. The simultaneous use of a Biomek 2000 robotic workstation (Beckman) and of an 
ALF automatic sequencer with the Fragment Manager version 1.2 software allows the amplification 
and sizing of 300 to 500 samples with three chloroplast microsatellites per day per operator. 

Case studies 

Geographic distribution of the diversity 

Picea abies 

Three chloroplast microsatellites (cpSSRs), previously sequence-characterised and for which paternal 
inheritance was tested and confirmed (Sperisen et al., 1998), were used to assess their usefulness as 
informative markers for phylogeographic studies in Norway spruce (Picea abies) and to detect spatial 
genetic differentiation related to the possible recolonization processes in the post-glacial period 
(Vendramin et al., 2000). Ninety-seven populations were included in the survey. Eight, seven and six 
different size variants for the three cpSSRs, respectively, were scored by analysing 1,105 individuals. 
These 21 variants combined into 41 different haplotypes. The distribution of some haplotypes 
showed a clear geographic structure and seems to be related to the existence of different refugia during 
the last glacial period (Figure 3). Haplotype 03 (116/96/144 as variant size at the three loci analysed, 
respectively) was present only in Scandinavia and northeastern and southeastern Europe, while 
haplotype 04 (116/100/143) was detected only in the western part of the natural range (Central Europe 
and the Alps; Figure 3). The analysis of chloroplast SSR variation revealed the presence of two main 
gene pools (Sarmathic-Baltic and Alpine-Centre European) and a relatively low degree of differentiation 
(R_ST (Slatkin, 1995) of about 10%), characteristic of tree species with large distribution and probably 
influenced by intensive human impact on this species. No evidence for the existence of additional gene 
pools (e.g. from Balkan and Carpathian glacial refugia) was obtained, though a genetic 
discontinuity within the species' European range was observed (Figure 4). 

Geostatistics was applied to the chloroplast simple sequence repeat (cpSSR) haplotype-frequency data 
from the 95 European Norway spruce populations (Bucci and Vendramin, 2000) to provide preliminary 
evidence about the following issues: 1) delineation of genetically homogeneous regions ('genetic 
zones'); 2) prediction of their haplotype frequencies and definition of related criteria to be applied for 
provenance identification and certification of seedlots; 3) construction of a continental-scale 'availability 
map' of the intraspecific biodiversity for Norway spruce. Direct evidence for large-scale geographic 
structure over the European natural range was obtained, detecting both geographic clines and stationary 
patterns. The increase of the mean genetic divergence with geographic distance (up to about 1,800 km) 
provided a strong hint that geographic distance is a major factor for differentiation in Norway spruce. 
Haplotype-frequency surfaces were obtained by applying ordinary kriging to sampling frequency data. 
Cluster analysis carried out on haplotype-frequency surfaces revealed a fair discrimination among 16 
genetic zones (Figure 5). Dendrogram analysis carried out on the predicted mean haplotype frequency 
confirmed a fairly good separation of the detected genetic zones (Figure 6). 

Application of geostatistical analysis to large amounts of genetic data is a promising tool for the 
analysis of complex geographic patterns, with the aim of reconciling appropriate conservation strategies with 
breeding exploitation of genetic resources. 

Abies alba 

Based on two polymorphic chloroplast microsatellites that had been previously identified and sequence- 
characterised in the genus Abies (Vendramin and Ziegenhagen, 1997), genetic variation was studied in a 
total of 714 individuals from 17 European silver fir (Abies alba Mill.) populations distributed all over 
the natural range (Vendramin et al., 1999). Eight and 18 different length variants at the two loci, which 
combined into 90 different haplotypes, were detected (Figure 7). Genetic distances between most 
populations as measured by d_0 (Gregorius, 1984) were high and significant. There is also evidence for 
spatial organisation of the distribution of haplotypes, as shown by permutation tests, which demonstrate 
that genetic distances increase with spatial distances (Figure 8). Large heterogeneity in diversity across 
populations was observed, as measured by the number of haplotypes, by the unbiased index of diversity 
of Nei (1973) as well as by allelic richness. Furthermore, there is good congruence in the levels of allelic 
richness of the two loci across populations. The present organisation of levels of allelic richness across 
the range of the species is likely to have been shaped by the distribution of refugia during the last 
glaciation and the subsequent recolonization processes. Mainly, processes of genetic drift due to 
bottleneck and effects of isolation could be inferred from the obtained data. 
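
To make the haplotype counts and diversity measures above concrete, the sketch below combines per-locus size variants into haplotypes and computes one common form of an unbiased diversity estimator, n/(n-1) * (1 - sum of squared haplotype frequencies). That this is the exact index used in the studies cited is an assumption, and the four individuals are invented for illustration.

```python
from collections import Counter

LOCI = ("Pt 30204", "Pt 71936")

def haplotypes(individuals, loci=LOCI):
    """Combine per-locus size variants into one haplotype tuple per individual."""
    return [tuple(ind[locus] for locus in loci) for ind in individuals]

def unbiased_diversity(haps):
    """n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(haps)
    if n < 2:
        raise ValueError("need at least two individuals")
    freqs = [count / n for count in Counter(haps).values()]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))

# Hypothetical sample: size variants (bp) at two loci for four trees.
sample = [
    {"Pt 30204": 147, "Pt 71936": 149},
    {"Pt 30204": 147, "Pt 71936": 149},
    {"Pt 30204": 137, "Pt 71936": 149},
    {"Pt 30204": 136, "Pt 71936": 147},
]
haps = haplotypes(sample)
print(len(set(haps)), round(unbiased_diversity(haps), 3))
```

Because the chloroplast genome does not recombine, the two loci are treated as a single haplotype rather than as independent markers, which is why a handful of size variants per locus can combine into many haplotypes.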

Pinus halepensis (Aleppo pine) and Pinus brutia (Brutian pine) 

Nine paternally inherited chloroplast SSR markers were used to describe genetic variation in the two 
closely related species belonging to the halepensis complex (P. halepensis Mill. and P. brutia Ten.). 
They reveal a large polymorphism both within and among populations (Bucci et al., 1998). The high 
level of among-population genetic divergence (R_ST, Slatkin, 1995) found for P. halepensis and the low 
within-population haplotypic diversity (except for Greek and Southern Italian stands) (S_W, Slatkin, 1995) 
are consistent with the hypothesis of a recent expansion of the species (last 10,000 years), with 
colonising populations establishing by migration of a limited number of individuals (founder effect) 
and/or population dynamics regulated by fire (population bottlenecks). Analysis of the geographic 
distribution of haplotypic diversity revealed two main groups of Aleppo pine populations: a central 
Mediterranean group, centred on southern Italy and comprising northern Italian and Spanish 
populations; and a southern Mediterranean group, centred on Greece. Almost all the haplotypic diversity 
detected in this species is concentrated in a very limited area located in Greece, which is considered to 
be the centre of origin of the species from which the recolonization in the post-glacial period started 
(Morgante et al., 1998). For P. brutia, no clear geographic structure was found, even though the degree 
of genetic differentiation was relatively high (R_ST of about 30%). 

Paternity analysis and detection of natural hybridisation (introgression) 

Abies alba 

Two relatively isolated adult trees about 30 m apart, as well as 24 naturally regenerated young trees in 
their direct neighbourhood, were analysed at two chloroplast microsatellite loci (Ziegenhagen et al., 
1998). Among all adult and young trees, five different length variants were found for each of the two 
microsatellites. Observed individual combinations of the size variants of the two loci allowed the 
definition of five different haplotypes. Figure 9 schematically represents the spatial distribution of the 
two adult trees (A and B) and the surrounding young trees. Each individual is characterised by the 
length variants at both microsatellite loci and the relevant two-locus haplotype. The results reveal that 
the two adult trees can be distinguished by both microsatellite loci. Comparison of the haplotypes of the 
adult trees and of the surrounding young trees indicates that, for six out of 16 young trees, paternity of 
either A or B can unambiguously be excluded. The study demonstrates the potential usefulness of a 
novel molecular approach towards paternity analysis in a conifer species. 
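
The exclusion logic can be sketched in a few lines: with a paternally inherited, non-recombining chloroplast genome, a candidate father must carry exactly the offspring's two-locus haplotype. The tree labels and size variants below are hypothetical, the function name is illustrative, and mutation between generations is ignored.

```python
def possible_fathers(adults, offspring_haplotype):
    """Return the adults whose haplotype matches the offspring's exactly.

    adults: dict mapping tree label -> haplotype tuple (size variants in bp).
    An empty list means every candidate adult is excluded as the pollen parent.
    """
    return sorted(label for label, hap in adults.items()
                  if hap == offspring_haplotype)

# Hypothetical two-locus haplotypes (size variants in bp).
adults = {"A": (147, 149), "B": (137, 150)}
young_trees = {"y1": (147, 149), "y2": (137, 150), "y3": (140, 151)}

for label, hap in young_trees.items():
    fathers = possible_fathers(adults, hap)
    print(label, fathers if fathers else "both adults excluded")
```

A match does not prove paternity, since unsampled trees may share the haplotype; only the exclusions are unambiguous, which is how the result is phrased in the text.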

Pinus halepensis and Pinus brutia 

Three "diagnostic" markers showing size variants clearly distinguishing P. halepensis from P. brutia 
were used to throw light on the occurrence of natural hybridisation in a Turkish sympatric population of 
P. halepensis and P. brutia (Bucci et al., 1998). Strong evidence of introgression of "halepensis" 
haplotypes into P. brutia seeds (but not vice versa) was detected. The overall hybridisation rate was 
estimated to be about 11% of the total number of matings analysed (Table 4). Possible explanations for 
the observed unidirectional, interspecific gene flow are: 1) differential pollination success (for 
example, in terms of pollen tube growth rate); 2) unbalanced potential pollen donors (for example, due 
to different stand density and/or different stage of P. halepensis and P. brutia individuals); 3) embryo 
abortion of P. brutia (male) x P. halepensis (female) hybrids due to a cellular or molecular mechanism 
of incompatibility. Previously reported evidence on artificial crossings indicates that the last explanation 
can be considered a useful working hypothesis for further research. 

No.   Haplotype      P. brutia seeds       P. halepensis seeds   Overall   Type 
                     Embr   Mega   Allo    Embr   Mega   Allo 

1     107 120 76        0      2      0       0      0      0        2     brutia 
2     108 119 76        0      6     10       0      0      0       16     brutia 
3     108 120 76       23     25     15       0      0      0       63     brutia 
4     108 120 77        3      0      6       0      0      0        9     brutia 
5     108 121 76        2      1      0       0      0      0        3     brutia 
6     108 122 76        0      1      0       0      0      0        1     brutia 
7     109 120 76        0      1      0       0      0      0        1     brutia 
8     112 114 82        8      0      0      34     30      2       74     halepensis 
9     112 115 82        0      0      0       1      4     37       42     halepensis 
10    113 114 82        0      0      0       1      2      2        5     halepensis 

Sample size            36     36     72      36     36    127      343 
No. Haplotypes          5      6     12       3      3     15       44 

Table 4: Analysis of the occurrence of natural hybridization in a sympatric population (Kirankoy - 
Turkey) of P. halepensis and P. brutia. Three 'diagnostic' cpSSR marker loci (Pt26081, Pt36480, and 
Pt41093) showing size variants clearly distinguishing P. halepensis from P. brutia were used. 'Embr' 
and 'Mega' refer to embryos and megagametophytes collected from trees of the sympatric population of 
the two species; 'Allo' refers to embryos collected from trees of the allopatric populations of the two 
species. 'Type' refers to the classification of the haplotypes based on the similarity with those detected in 
allopatric populations. Haplotypes for the same three loci detected in allopatric stands but not in the 
sympatric population are not reported. Overall sample size and number of haplotypes detected (last two 
rows) also include the individuals in allopatry not displayed here (from Bucci et al., 1999). 
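
The hybridisation rate quoted in the text can be retraced from the 'Embr' columns of Table 4, on the assumption that an embryo carrying a haplotype of the other species' type records one interspecific mating (the chloroplast haplotype of the embryo comes from the pollen parent). The counts below are sums of the embryo entries as read from the table; the dictionary layout and names are illustrative.

```python
# Embryos in the sympatric stand, grouped by seed (mother) species and by the
# species type of the embryo's haplotype; summed from the 'Embr' columns of Table 4.
EMBRYOS = {
    "P. brutia": {"brutia": 23 + 3 + 2, "halepensis": 8},
    "P. halepensis": {"brutia": 0, "halepensis": 34 + 1 + 1},
}

def hybrid_counts(embryos):
    """Return (hybrid embryos, total embryos) across both seed species."""
    hybrids = total = 0
    for mother, by_type in embryos.items():
        mother_type = mother.split()[-1]  # "brutia" or "halepensis"
        for hap_type, count in by_type.items():
            total += count
            if hap_type != mother_type:
                hybrids += count
    return hybrids, total

hybrids, total = hybrid_counts(EMBRYOS)
rate = hybrids / total
print(hybrids, total, f"{rate:.1%}")  # 8 72 11.1%
```

All eight hybrid embryos come from P. brutia mothers, which corresponds to the unidirectional gene flow described in the text.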



Conclusions 

Chloroplast microsatellite analysis represents an extremely useful and informative approach for studying
population history, for monitoring gene flow and hybridisation and for identifying areas harbouring high
levels of genetic diversity in conifer species. For all the species studied so far, these markers showed a
high degree of length polymorphism, both within and among populations. Where allozyme data exist
for comparison and fixation within populations is measured by G_ST, fixation at cpSSR marker loci is
significantly higher than at the allozyme loci. In comparison to species with maternally inherited
chloroplast DNA polymorphism, however, G_ST values for paternally inherited cpSSRs in conifers are
lower. Long-range pollen dispersal as well as different mutation rates may account for this. This
makes paternally inherited, highly polymorphic cpSSRs in conifers less suitable for tracing long-range
recolonization routes for intraspecific spatial phylogeny. In contrast, they are the most advantageous
markers for tracing past genetic processes such as drift or isolation, owing to the combination of a high
degree of polymorphism and uniparental inheritance; the latter makes the reproductively effective
population size one-half that of biparentally inherited markers. Moreover, they
are characterised by a high degree of "universality", thus allowing the transfer of primers from one
species to others belonging to different taxonomic classifications and therefore obviating the need and
expense to develop primers independently for each species. Limited technological investments are
required for the analysis of chloroplast microsatellite variation. Moreover, primers and methods
developed in one lab can be easily transferred to other laboratories, with a high degree of
reproducibility. The approach and the method of detection can also be automated to a large extent,
dramatically increasing its efficiency and allowing a large set of data to be obtained in a relatively short
period of time. The availability of a large set of data is an important requisite for the
development of maps of the distribution of genetic resources of forest tree species: distribution maps are
useful tools for an appropriate definition of programmes for the conservation of biodiversity.
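
As an illustration of the fixation measure mentioned above, the following minimal sketch computes G_ST = (H_T - H_S) / H_T from haplotype counts, where H_S is the mean within-population gene diversity and H_T the diversity of the pooled (mean) frequencies. It assumes equal weighting of populations and the simple, uncorrected diversity 1 - sum(p^2); the function names and the example counts are hypothetical and are not taken from the studies cited here.

```python
def gene_diversity(freqs):
    """Gene diversity H = 1 - sum(p_i^2) for a set of haplotype frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def g_st(populations):
    """G_ST from a list of {haplotype: count} dicts, one per population.

    Assumption: populations are weighted equally and no small-sample
    correction is applied.
    """
    freq_tables = []
    for counts in populations:
        n = sum(counts.values())
        freq_tables.append({h: c / n for h, c in counts.items()})

    k = len(freq_tables)
    # Mean within-population diversity.
    h_s = sum(gene_diversity(f.values()) for f in freq_tables) / k

    # Diversity of the mean haplotype frequencies across populations.
    haplotypes = set().union(*freq_tables)
    mean_freqs = [sum(f.get(h, 0.0) for f in freq_tables) / k for h in haplotypes]
    h_t = gene_diversity(mean_freqs)

    if h_t == 0.0:  # a single haplotype everywhere: no diversity to partition
        return 0.0
    return (h_t - h_s) / h_t


# Hypothetical counts: two stands sharing haplotypes at different frequencies.
stands = [{"h1": 18, "h2": 2}, {"h1": 4, "h2": 16}]
print(round(g_st(stands), 3))
```

Populations fixed for different haplotypes give G_ST = 1, and populations with identical frequencies give G_ST = 0, which is the range the comparison between cpSSR and allozyme loci refers to.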
Figure 1: Chloroplast SSR fragments displayed by the Fragment Manager software on a Pharmacia
A.L.F. automated sequencer. PCR reactions were performed as described in the text. The Fragment
Manager v. 1.2 conversion software was used to detect and size the peaks of fluorescence. Lanes 4 and
20 contain a ladder that is used as external standard. Lanes 1, 2, 3, 5-19: amplifications of three
chloroplast microsatellites of Picea abies individuals. Fragment sizes are displayed above the
corresponding band. The 50- and 196-bp constant bands in all the lanes are fragments of known size
used as internal standards and loaded together with the samples.




Figure 2: Alignment of cpDNA sequences. A Locus Pt 30204, B Locus Pt 71936.

Ap (Abies pinsapo), AaA (Abies alba A), AaB (Abies alba B), AaC (Abies alba C), and Pt (Pinus
thunbergii). Type in italics marks the sequences of the primers, lowercase type indicates nucleotide
substitutions and dashes stand for deletions. Coloured sequences indicate the microsatellite stretches.

a Vendramin and Ziegenhagen, unpublished, b Vendramin and Ziegenhagen, unpublished, c GenBank
accession No. U82922, d Vendramin and Ziegenhagen, unpublished, e DDBJ accession No. D11467,
f GenBank accession No. U2923, g GenBank accession No. U2924 (from Vendramin and Ziegenhagen,
1997).
Figure 3: Geographic distribution of cpSSR haplotypes 116/96/144 (Haplo03) and 116/100/143
(Haplo04, allele size at the three loci Pt 26081, Pt 63718, and Pt 71936, respectively) across the Norway
spruce natural range. Circles representing the populations are proportional to the relative within-stand
haplotype frequency (from Vendramin et al., 2000).

Figure 4: Plot of the first two components of the standardized PCA carried out on transformed
haplotype frequencies. Stands belonging to the same geographic area are shown by the same symbol.
Despite the large genetic noise and the continuous variation across the range, a fairly good separation
between two main groups of populations (the "Sarmathic-Baltic" group, including populations from western
Russia and Fennoscandia, and the "Alpine - Centre European" group, including populations from Italy,
Switzerland, Austria, Slovenia) can be recognized. Balkanian stands were clustered at the boundary
between the two main gene pools, while centre-European and Carpathian populations were scattered all
over the two groups. A third, possible group of populations including stands from the south-western Alps
was detected. Lines delimiting gene pools were drawn arbitrarily (from Vendramin et al., 2000).

Figure 5: Results of the k-means cluster analysis carried out on interpolated haplotype-frequency
surfaces (from Bucci and Vendramin, 2000).

Figure 6: Majority-rule consensus tree obtained by the restricted maximum likelihood method (reml) on
transformed mean haplotype frequencies for each genetic zone. Bootstrap values (N = 1000) are
indicated at each node (from Bucci and Vendramin, 2000).

Figure 7: Relative frequencies of two-locus combinations (haplotypes) identified in the 17 investigated
silver fir populations (from Vendramin et al., 1999).

Figure 8: Distogram of the average (D) of the genetic distances d0 (Gregorius 1984) of all pairs of
individuals belonging to each of seven spatial distance classes. The 90% confidence interval of 1000
permutations is presented. (a): Distogram of average genetic distance D estimated for locus Pt 30204.
(b): Distogram of average genetic distance D estimated at locus Pt 71936 (from Vendramin et al., 1999).

Figure 9: Schematic drawing of two isolated adult Abies alba trees A and B with surrounding young
Abies alba trees. Circles represent individual trees, numbers in circles give the size variants in base pairs
at two chloroplast microsatellite loci Pt 30204 and Pt 71936 (Vendramin and Ziegenhagen 1997b).
Length variants of Pt 30204 are located at the top, length variants of Pt 71936 at the bottom of each
circle, the combination yielding the relevant haplotype. Three different symbols (types of shading) were
chosen: two for the individuals carrying the two different haplotypes of A and B, and one for
characterizing individuals with all other haplotypes different from A and B (from Ziegenhagen et al.,
1998).

© Institut für Forstgenetik und Forstpflanzenzüchtung, Universität Göttingen, 1999
Abstract

Molecular marker technologies have eased and potentiated genetic analysis of plants and have become an extremely 
useful tool in forest tree breeding. The information provided by molecular markers has made it possible to 
acquire further knowledge about the structure and organization of plant genomes as well as about the evolution 
of these plant genomes through phylogenetic analysis. Using Populus spp. as a model tree, this paper aims at 
showing and discussing the possible applications of AFLP™, a high-density DNA marker technology developed 
by Keygene N. V. (Wageningen, The Netherlands). Applications include: (i) AFLP analysis of the disease resistance 
against Melampsora larici-populina using bulked-segregant analysis, (ii) AFLP fingerprinting for identification 
and taxonomic analysis of individual trees, and (iii) AFLP-based mapping strategies in Populus. 

Abbreviations: AFLP = amplified fragment length polymorphism; RFLP = restriction fragment length polymorphism;
PCR = polymerase chain reaction; QTL = quantitative trait loci; RAPD = random amplified polymorphic DNA



Introduction 

The genus Populus shows wide genetic variation. 
Populus spp. are native to the Northern hemisphere 
(North America, Europe, North Africa, and Asia). 
Thirty species of poplars, cottonwoods, and aspens 
have been described. Interspecific as well as intra- 
specific hybrids are easily obtained and often display 
an extensive heterosis effect. 

Poplar is considered a model tree in all aspects of
forest tree biology. In general, Populus trees are easy to
grow and can be vegetatively propagated from stem or
root cuttings, thereby enabling rapid multiplication
for experimental or commercial use. Due to its rapid
growth, poplar (Populus spp.) has become a tree of
high economic importance. A wealth of information about
the genetic material is available, multi-generation pedi-
grees exist, and large-scale screenings of parents and
respective progenies segregating for traits of scien-
tific or commercial interest have been performed.
In Europe, poplar wood is used primarily for the
construction of boxes, pallets, soft board, and multi-
plex, whereas in the United States and Canada poplar
wood is mainly processed to pulp and paper.

Due to the advantageous molecular characteristics,
genetic manipulations are more accessible in poplar
than in most forest trees; the nuclear genome is rela-
tively small (2C = 1.1 pg) and the chromosome number
of all species is identical (2n = 38) [5, 42]. Several
species of poplar can be transformed either by
Agrobacterium tumefaciens infection or by direct DNA
transfer procedures [17]. A fast and efficient method to
transform P. tremula x P. alba by Agrobacterium has
been described [22].

Different molecular techniques, such as restric- 
tion fragment length polymorphism (RFLP) and 
polymerase chain reaction (PCR)-based techniques, 
have been applied with the aim of detecting molecular 
markers associated with qualitative or quantitative trait 
loci (QTLs) of poplar, especially those related to 
disease resistance and economically important traits 
[5, 7]. This approach should result in powerful and 
fast methods for indirect selection. 

The identification of molecular markers associated
with a particular trait is crucial for the breeder because
the selection for this trait can then be performed at very
early stages of development. Moreover, the number of
progeny plants to be scored can be increased drama-
tically. This should allow the isolation of individuals
carrying a range of common traits of interest yet genet- 
ically diverse for the rest of their genome. In this 
way, multiclonal plantations can be obtained where the 
individuals have the same characteristics but remain 
polymorphic. A high biodiversity in the plantation is 
expected to increase the scope of resistance against 
new pathogens. In addition, molecular markers will 
assist breeders in the choice of parents for new breed- 
ing programs. 

Molecular markers facilitate the introgression of 
traits. The aim of different breeding programs is to 
introgress a gene, or genes, of interest from a donor 
parent into an elite parent, generally through back- 
crossing. Molecular markers are used to monitor the 
presence or absence of the locus to introgress in a 
segregating population and also facilitate the selection 
of those back-crossings that are genetically the most 
similar to the recurrent parent [14, 25]. 

High levels of heterozygosity shown by individual 
trees together with the availability of F1 families and
a molecular marker technology are sufficient to gen- 
erate genomic maps for forest tree species. Genomic 
maps provide information about genome structure, 
organization and evolution. Among related species, 
comparative analysis of genetic linkage maps (syntenic 
mapping) allows the determination, by using heterol- 
ogous probes, of the relative order of genes along the 
chromosomes [1, 2, 4, 19, 20, 21, 38, 40, 43, 44]. The 
generation of high-density molecular maps permits the 
identification of molecular markers tightly linked to a 
locus of interest, which may lead to map-based cloning 
of the gene(s) of interest by chromosome landing [39]. 



Recently, several high-density marker technologies
have been developed. Among them, AFLP™¹ is con-
sidered one of the most powerful technologies. This
method was developed at Keygene N.V. (Wageningen,
The Netherlands) by Zabeau and Vos [41, 45]. AFLP
markers assay the presence/absence of restriction
enzyme sites and sequence polymorphisms adjacent
to these sites. Briefly, three crucial steps are followed
to obtain AFLP markers: (i) digestion of genomic DNA
with two different enzymes, such as MseI (frequent-
cutter enzyme) and EcoRI (rare-cutter enzyme); (ii)
ligation of adapter oligonucleotides to the restriction
ends; and (iii) selection of fragments by two successive
PCR-based amplification steps using primers comple-
mentary to the adapter oligonucleotides with addition-
ally one to three selective nucleotides.
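
The three steps above can be sketched in silico. The following is a deliberately simplified illustration, not the Keygene protocol: it treats a fragment as the sequence lying between two neighbouring recognition sites, ignores the exact cut positions, sticky ends and adapter sequences, and models selective amplification as a match between the selective nucleotides and the first bases each primer would read inward from its own site. The function names and the test sequence are hypothetical.

```python
import re

# Recognition sequences of the two enzymes named in the text.
SITES = {"EcoRI": "GAATTC", "MseI": "TTAA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq):
    """Sorted list of (start, end, enzyme) for every recognition site."""
    hits = []
    for enzyme, site in SITES.items():
        for match in re.finditer(site, seq):
            hits.append((match.start(), match.end(), enzyme))
    return sorted(hits)


def aflp_fragments(seq, eco_sel="", mse_sel=""):
    """Inserts flanked by one EcoRI and one MseI site that pass selection.

    eco_sel / mse_sel are the selective nucleotides (0 to 3 bases) that
    extend the EcoRI and MseI primers into the fragment.
    """
    hits = find_sites(seq)
    selected = []
    for (_, end1, enz1), (start2, _, enz2) in zip(hits, hits[1:]):
        if enz1 == enz2:
            continue  # keep only EcoRI-MseI fragments
        insert = seq[end1:start2]
        # The left primer reads the top strand, the right primer the bottom.
        left_read, right_read = insert, revcomp(insert)
        if enz1 == "EcoRI":
            eco_read, mse_read = left_read, right_read
        else:
            eco_read, mse_read = right_read, left_read
        if eco_read.startswith(eco_sel) and mse_read.startswith(mse_sel):
            selected.append(insert)
    return selected


toy = "GAATTC" + "ACCGTG" + "TTAA"  # hypothetical template
print(aflp_fragments(toy, eco_sel="AC", mse_sel="CA"))
```

Each added selective nucleotide keeps roughly one quarter of the candidate fragments on average, which is why the number of selective bases is used to tune the number of bands per lane.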

AFLP has several advantages over random amplified
polymorphic DNA (RAPD). First, a higher
number of loci can be analyzed per experiment; 
approximately 10-fold the number of informative 
markers are obtained per analysis. Second, AFLP 
markers are co-dominant; by using the appropriate 
equipment and software to analyze the gels, it is possi- 
ble to identify whether an allele is homo- or heterozy- 
gous. This provides more information than dominant 
markers such as RAPDs [41]. Third, AFLP gives 
highly reproducible banding patterns due to a highly 
specific annealing of the primers to the complementary 
adapter oligonucleotides [41]. The RAPD technology, 
on the other hand, can suffer from a lack of repro- 
ducibility which is caused by mismatch annealing of 
the random primers. 

AFLP analysis of disease resistance in Populus 

Disease resistance has always been one of the most 
important selection criteria in poplar breeding. Clones 
selected by the breeders are vegetatively propagated 
to generate monoclonal forests which, unfortunately, 
are known to be fragile with regard to new pathogen 
attacks [30]. 

The most important diseases damaging poplar 
species in Central and Northern Europe are leaf rust 
caused by the fungus Melampsora larici-populina, 
bacterial canker, caused by Xanthomonas populi, and 
leaf spot, caused by the fungus Marssonina brunnea. 
The infection produced by M. larici-populina causes
premature defoliation and can reduce growth by more
than 20%. Trees defoliated early in the growing season
become more susceptible to secondary pathogen infec-
tions and to environmental stress [30]. Repeated infec-
tions during successive years can result in a complete
loss of the plantation.

1 AFLP is a registered trademark in the Benelux.

We have identified molecular markers tightly
linked to the locus conferring resistance against M.
larici-populina in Populus by applying the AFLP tech-
nique [10]. Segregation of the resistance against M.
larici-populina in the F1 progeny (family 87001) of
a controlled cross between a resistant female, P.
deltoides (clone S 9-2), and a susceptible male, P.
nigra (Ghoy), is consistent with the combination of
two different factors: a monogenic resistance caused
by a single dominant resistance gene, resulting in 50%
resistant progeny, and a multigenic (horizontal) resis-
tance displayed by different degrees of susceptibility
shown by the other 50%. Our efforts have focused on
the identification of molecular markers co-segregating
with the monogenic resistance. To facilitate the screen-
ing of family 87001 and to increase the chance of
identifying molecular markers that are closely linked
to the resistance locus, we decided to use bulked
segregant analysis (BSA) [27]. This method is based on
bulking DNAs from segregating populations. For our
analysis, two bulks of DNA were prepared by pool- 
ing equal amounts of DNA extracted from resistant 
plants ("resistant bulk") and from susceptible plants 
("susceptible bulk"). Using 144 primer combinations, 
we identified three different AFLP markers present in 
the resistant parent and the resistant bulk, but absent in 
the susceptible parent and the susceptible bulk. Link- 
age of these markers to the resistance gene was con- 
firmed by AFLP analysis of each individual DNA from 
the family 87001 (Figure 1) [10]. 
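
The screening rule described above (band present in the resistant parent and resistant bulk, absent in the susceptible parent and susceptible bulk) can be written as a small filter. This is only an illustrative sketch: the lane labels, marker names and presence/absence scores are hypothetical and do not come from the study.

```python
def bsa_candidates(bands):
    """Return markers whose band pattern fits linkage to the resistance locus.

    bands maps a marker name to presence (True) or absence (False) of the
    band in four lanes: resistant parent (P_R), susceptible parent (P_S),
    resistant bulk (B_R) and susceptible bulk (B_S).
    """
    return [
        marker
        for marker, lane in bands.items()
        if lane["P_R"] and lane["B_R"] and not lane["P_S"] and not lane["B_S"]
    ]


# Hypothetical scores for three bands from one primer combination.
scores = {
    "band_a": {"P_R": True, "P_S": False, "B_R": True, "B_S": False},
    "band_b": {"P_R": True, "P_S": False, "B_R": True, "B_S": True},
    "band_c": {"P_R": True, "P_S": True, "B_R": True, "B_S": True},
}
print(bsa_candidates(scores))
```

A band that also appears in the susceptible bulk (band_b) is polymorphic between the parents but unlinked, or only loosely linked, to the resistance locus, so it is discarded; candidates still need confirmation on individual progeny, as the text describes.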



AFLP fingerprinting as a tool for identification 
and taxonomic analysis 

Obtaining genomic fingerprints for each species of
Populus has become a powerful tool to discriminate
individual genotypes and to determine phylogenetic
relationships among these species. Traditional taxon-
omy based on morphological characters [15], isozyme
analysis [23, 32, 33, 37] and gas chromatography [13]
revealed the first estimations of genetic and taxonomic
relationships between Populus species but was often
strongly influenced by the environment. RFLP tech-
nology has contributed to the identification of DNA
polymorphisms as genetic markers in Populus. In this
way RFLP genome analysis [8, 18, 23], mitochon-
drial DNA analysis [3, 31], chloroplast DNA studies
[26, 34-36], and ribosomal DNA analysis [11] have
been performed in different Populus species. Recently,
PCR-based marker systems, such as sequence-tagged
site (STS), RAPD, or simple sequence repeat (SSR)
have caused a revolution in genome analysis; two of
these methods, RAPD and STS, have been used in
genetic studies of Populus spp. [7-9, 23].

Figure 1. Identification of an AFLP marker linked to a Melampsora
resistance locus in Populus. (A) The bulked segregant analysis
(BSA) is presented as a set of four lanes: resistant parent (P_R),
susceptible parent (P_S), resistant bulk (B_R), and susceptible bulk
(B_S). (B) The AFLP marker, identified by BSA and indicated by an
arrow, is present in resistant but absent in susceptible F1 progeny
(family 87001).

Given the high level of genetic polymorphism
between Populus species and the advantages offered by
AFLP analysis (see Introduction), studies on diversity
between Populus spp. using AFLP will probably result
in a highly reliable classification. Such an analysis is
being carried out in our laboratory using plant material
provided by the IBW-Geraardsbergen (former Insti- 
tute for Poplar Culture) in Belgium, which has a large 
collection of species originating from Asia, North 
America, and Europe. Using AFLP, approximately 60 
to 100 markers (bands per lane) can be obtained per 
primer combination and, on average, 60% of these 
bands are polymorphic between different species (P. 
deltoides and P. nigra) and 15-25% between two 
random full-siblings derived from an interspecific 
cross P. deltoides x P. nigra. These data indicate that it 
should be possible to establish reliable taxonomic rela- 
tionships of different Populus spp. as well as different 
genotypes of the same species. 

Using AFLP, patterns characteristic for a specific 
cross, including the fingerprint of an individual tree, 
can be obtained thereby reducing expensive and 
difficult storage of nearly identical material. Well 
managed, this can also be used as a tool to increase 
population diversity. Additionally, fingerprinting of 
individual trees will help the breeder to legally protect 
new genotypes that are to be released on the market. 

Genome mapping in Populus 

Genome maps allow the localization of loci controlling 
monogenic or multigenic traits. Complex trait mapping 
provides information about the number, chromosomal 
location, magnitude of effect and interactions of 
genetic loci controlling the expression of a particular 
trait. In the near future, it should be possible to clone,
by chromosome landing, genes controlling characters 
that are quantitatively inherited [16, 29]. 

The first linkage groups identified in Populus were 
obtained using alloenzymes [24, 28] and RFLPs [24]. 
Recently, a new Populus linkage map has been reported 
[8]. This map contains 343 DNA-based markers, 
combining RFLPs, STSs, and RAPDs, with which an 
F2 progeny of interspecific hybrids was mapped. QTLs 
with large effects on stem growth and form, two impor- 
tant commercial traits, and on spring leaf phenology, 
an adaptive trait, have been mapped [6-8]. 

We are constructing linkage maps of single indi-
vidual Populus trees using the "two-way pseudo-
testcross" strategy [12] in combination with the AFLP
technology. The "two-way pseudo-testcross" mapping
strategy is based on linkage analysis of those markers
that segregate 1:1 in an F1 full-sib family, i.e. those
markers that are heterozygous in one parent and null in
the other parent. In this way, two genome maps are
constructed, one for each parent. Given the high
level of heterozygosity in individual Populus trees and
the high number of markers generated by the AFLP
technique, we expect to obtain high-density linkage
maps of P. deltoides, P. nigra and P. trichocarpa.
These genome maps will be obtained by analyzing two
different F1 families (87001 and 87002) derived from
controlled interspecific crosses and sharing a common
female parent (P. deltoides). They will allow the
genetic mapping of resistance to pathogens such as
Melampsora spp., Xanthomonas populi, poplar mosaic
virus and Marssonina brunnea, as well as other
economically important traits such as those related to
stem growth, leaf phenology, shape, wood density and
frost damage.
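
Selecting markers for a pseudo-testcross map starts with a test of the 1:1 segregation described above. The sketch below uses a plain chi-square goodness-of-fit test with one degree of freedom; the 5% threshold, the function names and the band counts are assumptions chosen for illustration and are not values from this work.

```python
# Critical value of the chi-square distribution, 1 degree of freedom, alpha = 0.05.
CHI2_CRIT_1DF_5PCT = 3.841


def chi_square_1to1(present, absent):
    """Chi-square statistic for a 1:1 expectation of band presence:absence."""
    expected = (present + absent) / 2
    return ((present - expected) ** 2 + (absent - expected) ** 2) / expected


def segregates_1to1(present, absent):
    """True if the observed counts do not depart from 1:1 at the 5% level."""
    return chi_square_1to1(present, absent) < CHI2_CRIT_1DF_5PCT


# Hypothetical band counts scored in an F1 full-sib family of 100 trees.
print(segregates_1to1(48, 52))  # close to 1:1
print(segregates_1to1(70, 30))  # distorted, closer to 3:1
```

Markers that pass are assigned to the map of the parent in which the band is present; markers present in both parents and segregating 3:1 are not informative for this strategy.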
Microsatellite analysis of the regeneration
process of Magnolia obovata Thunb.

YUJI ISAGI*†, TATSUO KANAZASHI‡, WAJIROU SUZUKI§,
HIROSHI TANAKA§ & TETSUTO ABE§

†Kansai Research Centre, Forestry and Forest Products Research Institute, Momoyama, Fushimi, Kyoto 612-0855,
Japan, ‡Tohoku Research Centre, Forestry and Forest Products Research Institute, Morioka, Iwate 020-0123, Japan
and §Forest Environment Division, Forestry and Forest Products Research Institute, Tsukuba,
Ibaraki 305-8687, Japan

We analysed the regeneration process of Magnolia obovata using polymorphic microsatellite markers. 
Eighty-three adult trees standing in a watershed covering an area of 69 ha, and saplings collected 
from a smaller research plot (6 ha) located at the centre of the watershed were genotyped using 
microsatellite markers. Among 91 saplings analysed, 24 (26%) had both parents, 31 (34%) had one 
parent and 36 (40%) had no parent within the watershed. The proportion of genes in saplings 
inherited from the adults within the watershed was 43%, and therefore 57% were from outside the
site, indicating active gene exchange across the watershed area. Average distance between parents and 
saplings (264.6 ± 135.3 (SD) m) was significantly smaller than that of pairs randomly chosen 
between adults and saplings (436.7 ± 203.0 (SD) m). The distance of pollen movement inferred from
the distance between the two parents of each sapling ranged from 3.2 m to 540 m with an average 
of 131.1 m ± 121.1 m (SD). Because 34% (=31/91) of saplings had only one parent within the 
watershed, the estimate of average pollen movement must be smaller than the actual one. Long- 
distance seed dispersal by birds, inbreeding depression and limitation in acceptance of pollen because 
of the difference of phenology in each individual flower were considered to be the probable causes of 
large gene exchange across the watershed. 

Keywords: gene flow, pollen dispersal, pollination, seed dispersal. 



Introduction 

Magnolia obovata is a large, common deciduous tree of 
temperate forests in Japan reaching 30 m in height. Its 
large flowers do not secrete nectar, and are primarily 
pollinated by beetles (Thien, 1974) which are thought to 
be less efficient than bees. The flowers are protogynous 
and usually close between the female and male period 
(Kikuzawa & Mizui, 1990). Although the flowering 
period of each flower is 3-4 days, for an individual tree 
flowering persists for up to 40 days (Kikuzawa & Mizui, 
1990). The standing density of adult trees is relatively 
low at a few trees per hectare. In temperate forests in 
Japan, a few dominant tree species often occupy a large 
proportion of the canopy, e.g. Fagus crenata, and the
rest of the canopy is composed of tree species occurring 
at relatively low density such as Kalopanax pictus, 
Cornus controversa, Aesculus turbinata, Magnolia 
obovata, Magnolia salicifolia and Pterocarya rhoifolia. 

*Correspondence. E-mail: isagi yuv fsm.atfrc.go.jp



Although each species is at a relatively low density, the 
assemblage of these species is dominant as a whole, and 
determines the structure and diversity of the forest 
ecosystem. For such species, it is important to analyse 
the extent of pollen movement and seed dispersal to 
elucidate the regeneration process, mechanisms that 
maintain biological diversity in forest tree communities 
and also for conservation purposes. 

The pattern and degree of gene dispersal can affect the
genetic structure of plant populations (e.g. Schaal, 1980;
Ellstrand, 1992; Hamrick et al., 1992), and in higher
plants, gene dispersal occurs at reproduction through
pollen and seeds. Microsatellite loci are ideal for
quantifying pollen- or seed-mediated gene transfer in
natural plant populations because of their codominant
inheritance and high variability. Therefore, they should
provide high exclusion probabilities for paternity
assignment. We have developed 11 microsatellite marker
loci in Magnolia obovata (Isagi et al., 1999) to assign
parentage and examine gene transfer of this species. In
the present study, we estimate pollen and seed
dispersal distances and the magnitude of gene transfer in
a population where adult tree density was 1.2 ha⁻¹.

Materials and methods 

Field site 

Field research was conducted in Ogawa Forest Preserve,
Ibaraki Pref., central Japan (36°56'N, 140°35'E). The
elevation of the research area was 610-660 m a.s.l. and
annual mean air temperature and annual precipitation
were 9°C and 1800 mm, respectively. Dominant woody
species in the canopy were Quercus serrata, Fagus
japonica and Fagus crenata, etc. We established two
research plots, plots A and B, in the preserve (Fig. 1).
Plot A covered the whole watershed area of 69 ha.
Within this plot all of the adult trees of M. obovata
were located and the diameters at breast height (d.b.h.)
were measured. The other plot, plot B, occupied an
area of 6 ha (200 x 300 m), located at the centre of
plot A (Fig. 1). In plot B, intensive studies on the




Fig. 1 Map showing the distribution of adults and saplings 
of Magnolia obovata analysed with lines drawn between
parents and their offspring. 



plant community structure, community dynamics
(Nakashizuka et al., 1992) and population dynamics
of various tree species such as Carpinus (Shibata &
Nakashizuka, 1995), Acer (Tanaka, 1995) and Cornus
(Masaki et al., 1994) have been made.

Sampling 

Leaf samples from all of the reproductive adult trees (83 
trees) of M. obovata in plot A were collected. In plot B,
leaf tissue was sampled from 91 saplings. During leaf 
collection, the position of each tree was mapped. Leaf 
samples were stored at -70°C prior to DNA extraction. 

DNA extraction and microsatellite analysis 

Crude genomic DNA of M. obovata was extracted using
the CTAB method (Milligan, 1992). Genotypes of each
DNA sample were scored using eight pairs of microsat-
ellite PCR primers developed by Isagi et al. (1999). PCR
amplifications were performed, using a thermal cycler
(GeneAmp PCR System 9600, ABI), under the following
conditions: initial denaturing at 94°C for 9 min, then 30
cycles of denaturation at 94°C for 30 s, annealing for
30 s, and extension at 72°C for 1 min, followed by a final
incubation at 72°C for 7 min. The volume of the reaction
mixture was 10 µL containing 10 ng of DNA from
M. obovata, 5 pmol of primers labelled with fluorescent
phosphoramidites (TET or 6-FAM), 0.25 U of Taq
polymerase (AmpliTaq Gold, ABI), 200 µM of each
dNTP, 1.5 mM of MgCl2, 10 mM of Tris-HCl pH 8.3,
50 mM of KCl and 0.001% of gelatin. The PCR products
were resolved on a 5% denaturing polyacrylamide gel,
and the sizes were determined by automated fluorescent
scanning detection with the autosequencer ABI377 and
GeneScan™ analysis software (ABI).

Parentage analysis 

Parentage was assigned by comparing alleles between a
sapling and candidate parents (Dow & Ashley, 1996)
using genotype data at eight microsatellite loci: M6D1,
M6D3, M6D4, M10D3, M10D6, M10D8, M15D5 and
M17D5 developed by Isagi et al. (1999). Alleles at every
locus of each sapling were compared with those of adult
trees, and adults that shared no allele with the sapling
at one or more loci were excluded as candidate parents.
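
The exclusion procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions: genotypes are stored as a pair of allele sizes per locus, there are no null alleles or scoring errors, and the tree labels and allele sizes are hypothetical (only the locus names M6D1 and M6D3 are taken from the text).

```python
from itertools import combinations


def is_compatible(adult, sapling):
    """True if the adult shares at least one allele with the sapling at every locus."""
    return all(set(adult[locus]) & set(sapling[locus]) for locus in sapling)


def candidate_parents(sapling, adults):
    """Names of adults that cannot be excluded as a parent of the sapling."""
    return [name for name, geno in adults.items() if is_compatible(geno, sapling)]


def is_parent_pair(p1, p2, sapling):
    """True if, at every locus, one sapling allele can come from each adult."""
    for locus, (a, b) in sapling.items():
        if not ((a in p1[locus] and b in p2[locus]) or (b in p1[locus] and a in p2[locus])):
            return False
    return True


def parent_pairs(sapling, adults):
    """Pairs of non-excluded adults that jointly explain the sapling genotype."""
    names = candidate_parents(sapling, adults)
    return [
        (n1, n2)
        for n1, n2 in combinations(names, 2)
        if is_parent_pair(adults[n1], adults[n2], sapling)
    ]


# Hypothetical genotypes (allele sizes in bp) at two of the eight loci.
adults = {
    "A1": {"M6D1": (201, 215), "M6D3": (150, 152)},
    "A2": {"M6D1": (203, 205), "M6D3": (152, 154)},
    "A3": {"M6D1": (207, 209), "M6D3": (150, 154)},
}
sapling = {"M6D1": (201, 203), "M6D3": (150, 154)}

print(candidate_parents(sapling, adults))
print(parent_pairs(sapling, adults))
```

Only pairs of two different adults are examined here, so selfed saplings would need a separate check; with few loci or low allelic diversity several adults may survive exclusion, which is why highly variable markers are needed.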

Results 

Analysis of parentage 

The eight microsatellite markers were highly variable, 
and sufficiently informative to conduct the analysis of 
Table 1 Allele frequencies, numbers of heterozygotes and homozygotes, observed and expected heterozygosities at eight
microsatellite loci in 83 adults and 91 saplings of Magnolia obovata

Table 1 (Continued)

Allele            Adults   Saplings
[...]                      1.7
201               0.6
215               0.6      0.6
No. of alleles    30       25
Heterozygotes     80       89
Homozygotes       3        1
Ho                0.96     0.99
He                0.95     0.94

Locus M[...]: alleles 289, 293, 295, 297, 299, 301, 303, 305, 307, 313, 317
No. of alleles, Heterozygotes, Homozygotes: [values not preserved]


parentage. The number of alleles for each locus ranged 
from six (M15D5) to 36 (M6D4) with an average of 20.3
for adults and from seven (M15D5) to 30 (M6D4) with 
an average of 18.8 for saplings (Table 1). The number of 
alleles unique to adults and saplings was 30 and 19, 
respectively. 

Among 91 saplings found in plot B, 55 had at least 
one parent (first parent) in plot A whereas 36 had no 
parents within the watershed. Among the 55 saplings, 24 
had the second parent as an exact match, and 31 had 
only one parent in plot A. Out of the 31 saplings which 



had only one parent in plot A, 23 had only one 
candidate as an exact match, and eight had multiple 
matches with two candidates for one parent. No sapling 
seemed to be the product of self-pollination of adults in 
plot A. 

In order to estimate the amount of gene flow into the 
watershed, it is important to evaluate the amount of 
cryptic gene flow which reflects the possibility that a 
sapling identified as having a parent within the research 
plot actually had the parent outside the plot (Dow & 
Ashley, 1996). Using the allele frequencies at the eight 
Table 2 Number of saplings having the first and second parents within or outside
plot A, and values of cryptic gene flow for Magnolia obovata

                                          First parent   Second parent   Total
Saplings having parents within plot A     55 (53.95)     24 (23.99)      79 (77.94)
Saplings having parents outside plot A    67 (68.05)     36 (36.01)      103 (104.06)
Cryptic gene flow                         1.05           0.01

Figures in parentheses are numbers of saplings corrected for the values of cryptic gene flow.
Because the number of saplings analysed was 91, figures in the column 'Total' for
saplings having parents within and outside plot A should sum to 182 (= 2 parents x 91
saplings).



microsatellite loci and the formula of Marshall et al. 
(1998), the probability of excluding a single randomly 
chosen unrelated individual in plot A from parentage 
was determined as 0.999769 for the first parents and 
0.999995 for the second parents. Therefore, the 
probability of correctly excluding all unrelated adults 
(83 trees) within plot A was 0.999769^83 = 0.9810 for the 
first parents and 0.999995^83 = 0.9996 for the second 
parents. The number of saplings which had the first and 
second parents in plot A was 55 and 24, respectively 
(Table 2), so that the amount of cryptic gene flow was 
estimated as 55(1 - 0.9810) = 1.05 for the first parent 
and 24(1 - 0.9996) = 0.01 for the second parent. Therefore, 
the total gene flow events from outside plot A into 
plot B corrected for cryptic gene flow were 68.05 + 
36.01 = 104.06 (Table 2). Among 182 possible parents 
(= 91 saplings x 2) of saplings in plot B, 104.06 (57%) 
were outside plot A, indicating active gene flow across 
the watershed. 
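
The arithmetic behind the cryptic gene flow correction can be retraced with a minimal sketch. The single-candidate exclusion probabilities, the 83 adults, the 55 and 24 assigned parents and the uncorrected total of 103 outside parents are taken from the text and Table 2; the function name and the rounding to four decimals (mirroring how the text reports the joint probability) are assumptions made for illustration.

```python
def cryptic_gene_flow(single_exclusion_prob, n_candidates, n_assigned, digits=4):
    """Return (joint exclusion probability, expected number of chance matches)."""
    # Probability that every unrelated candidate is correctly excluded,
    # rounded to `digits` places as the text reports it.
    all_excluded = round(single_exclusion_prob ** n_candidates, digits)
    # Expected number of assigned parents that are chance matches.
    return all_excluded, n_assigned * (1.0 - all_excluded)


N_ADULTS = 83    # candidate parents in plot A
N_SAPLINGS = 91  # saplings genotyped in plot B

p_first, flow_first = cryptic_gene_flow(0.999769, N_ADULTS, 55)
p_second, flow_second = cryptic_gene_flow(0.999995, N_ADULTS, 24)

outside_uncorrected = 103  # Table 2, "Total" column, parents outside plot A
outside_corrected = outside_uncorrected + flow_first + flow_second
share_outside = outside_corrected / (2 * N_SAPLINGS)

print(p_first, p_second)
print(f"{flow_first:.3f} {flow_second:.4f}")
print(f"{outside_corrected:.2f} {share_outside:.0%}")
```

The text rounds each correction to two decimals (1.05 and 0.01) before adding, so its total of 104.06 can differ from the unrounded sum in the last digit; the share of parents outside plot A is 57% either way.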

Distance between parents and saplings 

Distance between parents and saplings, which represents 
either seed dispersal from maternal parents or 
pollen movement plus seed dispersal from paternal 
parents, was large (Fig. 1), ranging from 32.4 m to 
563.2 m with an average of 264.6 m ± 135.3 m (SD) 
(Fig. 2a). Although the value indicates active pollen 
and seed dispersal in the research site, the distance is 
limited to saplings for which parentage has been 
assigned within plot A. Therefore, it probably represents 
an underestimate of the true distance because 
57% of the parents of saplings were outside plot A. 
The distance between random pairs of adults in plot 
A and saplings in plot B ranged from 10.3 m to 
933.8 m with an average of 436.7 m ± 203.0 m (SD) 
(Fig. 2b), and was significantly greater than 
the distance between offspring and parent trees 
(U-test, P < 0.0001) (Fig. 2). This indicates that trees 
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Fig. 2 Histogram of distances for Magnolia obovata (a) 
between parents and progeny inferred with microsatellite 
markers, and (b) between adult trees and saplings randomly 
chosen in the research site. 



at a closer distance contribute more as parents of the 
saplings. 

Adult d.b.h. 

Diameter at breast height (d.b.h.) of adult trees ranged 
from 5.2 cm to 59.1 cm with an average of 28.3 cm. 
The average d.b.h. of adult trees that had progeny in 
plot B was 35.8 cm, and was significantly larger (U-test, 
Fig. 3 Diameter at breast height (d.b.h.) of adult Magnolia 
obovata. (a) d.b.h. of adults having their progeny in plot B. 
(b) d.b.h. of adults not having their progeny in plot B. 



P < 0.0001) than that of adults not having their 
progeny in plot B (24.5 cm), indicating that adults of 
larger size contributed more as pollen donors or seed 
parents to saplings in plot B (Fig. 3). 

Pollen movement 

Although we determined two parents of exact match 
for 24 saplings, it was impossible to distinguish which 
acted as pollen donor or seed parent of these 
saplings. Therefore, we cannot infer the real distance 
of seed dispersal by merely determining parent-offspring 
relationships. However, based on the distance 
between two parents of exact match, we can 
infer the extent of pollen movement. The distance of 
pollen movement ranged from 3.2 m to 540 m with 
an average of 131.1 m ± 121.1 m (SD). About 27% 
of pollination was carried out between nearest 
neighbours. 
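
How a pollen-movement distance and a nearest-neighbour check could be derived from mapped trees is sketched below. The coordinates, tree labels and function names are hypothetical and not taken from the study. The rank follows the convention of Fig. 5: position 1 means the mate is the nearest adult, and position n means n - 1 adults are closer.

```python
import math


def distance(a, b):
    """Euclidean distance between two (x, y) positions in metres."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def neighbour_position(focal, mate, adults):
    """Rank of `mate` among all other adults ordered by distance from `focal`."""
    d_mate = distance(adults[focal], adults[mate])
    closer = sum(
        1
        for name, xy in adults.items()
        if name not in (focal, mate) and distance(adults[focal], xy) < d_mate
    )
    return closer + 1


# Hypothetical stem map (metres); not data from the study.
adults = {"A": (0.0, 0.0), "B": (30.0, 40.0), "C": (10.0, 0.0), "D": (200.0, 0.0)}

# Hypothetical parent pairs inferred for two saplings.
pairs = [("A", "B"), ("A", "C")]

for p1, p2 in pairs:
    print(p1, p2, distance(adults[p1], adults[p2]), neighbour_position(p1, p2, adults))
```

The rank is measured from the first tree of each pair, so it need not be the same when the pair is reversed; which tree to treat as the reference is a design choice the sketch leaves open.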

The distances between random pairs of adults in the 
watershed showed a flat distribution, ranging from 
1.3 m to 1543.7 m with an average of 561.5 m (Fig. 4c). 
The distance between the nearest neighbours for each 
adult tree within the watershed was low; 93% of trees 
had their nearest neighbour within the range of 100 m 
(Fig. 4a). The average distance of pollen movement was 
Fig. 4 Histogram for Magnolia obovata of distances (a) to 
nearest neighbours for each adult tree, (b) of pollen movement 
inferred from microsatellite analysis, and (c) between random 
pairs of adult trees in plot A. 



significantly larger (U-test, P = 0.0010) than the average 
distance between nearest neighbours for each adult tree 
(44.1 ± 37.5 (SD) m), and significantly smaller (U-test, 
P < 0.0001) than that between random pairs of adult 
trees (561.5 ± 352.6 (SD) m) (Fig. 4c). This indicates 
that pollination occurs between adults located closer 
than the average distance between adult trees within the 
watershed, but is not always between nearest neighbours. 

The average distance between parents and their 
offspring (264.6 m; Fig. 2a) was significantly greater 
(U-test, P < 0.0001) than that of pollen movement 
(131.1 m; Fig. 4b), reflecting that the former distance 
consists of pollen movement and seed dispersal. 



Discussion 



Pollen dispersal 

Movement of pollen grains and seeds from point sources 
is known to show a leptokurtic or limited distribution 
(Sork, 1984; Ellstrand, 1992; Webb, 1998): many are 
dispersed near the source and there is a long tail of fewer 
Fig. 5 Relative positions for Magnolia obovata of pollen 
donors and seed parents between which pollination occurred. 
Position 1 indicates that the tree is the nearest to the other 
parent among adult trees, and position n indicates that there 
are n - 1 trees between the two trees. 



pollen grains and seeds at greater distances. However, 
the distribution of distances between parents and 
offspring inferred for M. obovata with microsatellite 
markers in the present study was not leptokurtic. 
Different pollen vectors and patterns of behaviour 
remarkably affect pollen dispersal (Schmitt, 1980; 
Waser, 1982; Hamrick et al., 1992; Webb, 1998). The 
pollinators for M. obovata are primarily beetles (Kikuzawa 
& Mizui, 1990), which are thought to be less 
efficient as pollen vectors (Ramsey, 1988). However, the 
distance of pollen movement in the present stand was 
quite large (average 131.1 m with a maximum value of 
540 m) (Fig. 4b). Tanaka & Yahara (1988) have shown 
that a variety of insects other than beetles, i.e. butterflies 
and bumble bees, pollinated M. obovata at a site in 
central Japan. It is feasible therefore that such pollinators 
were also effective and account for the long-distance 
pollen movement in the present stand. About one-third 
of saplings (= 31/91) found in plot B had only one of 
two parents within plot A. Hence, the average value of 
pollen movement must be an underestimate and the 
maximum distance may be more than 540 m. Chase 
et al. (1996) analysed the range of pollen dispersal for 
a tropical tree species, Pithecellobium elegans, using 
microsatellite markers. They found that average pollen 
dispersal was 142 m with a maximum value of 350 m. 
The average distance of pollen movement presently 
inferred for M. obovata is almost equivalent to that for 
P. elegans. Adult trees of M. obovata resemble P. elegans 
in that both species tend to occur at low density. It is 
possible that for tree species occurring at low density in 
natural communities pollination regularly occurs over a 
wide range. However, long-distance pollen flow for such 
tree species might be affected by various life cycle 
characteristics, habitat type, and type of pollen vectors. 
This will need to be examined in the future. 

Seed dissemination 

Seed dispersal characteristics affect the range of seed 
dissemination (Hamrick & Murawski, 1990), with short 
distances occurring for gravity dispersal and longer 
ones for animal and wind dispersal (Hamrick et al., 
1992). 

Diaspores of M. obovata have a red fleshy edible 
part, and are dispersed internally by birds. Distance of 
seed dispersal by birds has been considered to be quite 
long, for example blue jays carried acorns of Fagus 
grandifolia up to 4 km from the source (Johnson & 
Adkisson, 1985); however, few studies have measured 
this trait. Instead of direct measurement, the range has 
been estimated, for example, by determining the home 
range of birds (e.g. Fukui, 1995). Using appropriate 
microsatellite markers, and assuming that saplings with 
no possible parents within the research plot might grow 
from seeds pollinated outside the research plot and 
carried in by birds, we can infer the approximate range 
of seed dispersal by birds. It is notable that 40% (= 36/91) 
of the saplings in plot B had no parents within the 
69 ha research site (plot A), and the large proportion 
of these saplings reflects the active seed dispersal of 
this species by birds. The range of seed dispersal seems 
to reach more than several hundred metres and is 
significantly greater than that reported for seed dispersal 
by gravity or mammals such as mice and 
monkeys, whose dispersal ranges are within 100 m 
from the source (Sork, 1984; Jensen, 1985; Iida, 1996; 
Yumoto et al., 1998). Dow & Ashley (1996) analysed 
acorn dispersal of Quercus macrocarpa using microsatellite 
markers, and found that seed dispersal of 
Q. macrocarpa was limited compared with pollen 
movement; most seed dispersal was less than 30 m 
whereas pollen dispersal averaged 76.9 m. However, 
they also stated that long-distance seed dispersal was 
not so rare as previously estimated: 48% of the seeds 
were dispersed secondarily by animals, and among 
them 16% were dispersed more than 90 m away from 
the source. They also estimated the maximum frequency 
of saplings which had no parents within the 
research plot (about 5 ha) at 14%. In contrast, despite 
the much larger plot size for the present population of 
M. obovata (69 ha), the proportion of saplings without 
either parent within the research plot was larger 
(36/91 = 40%). 
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Factors that cause large gene transfer 
for M. obovata 

In natural plant populations, it has often been observed 
that actual gene flow occurs over greater distances than 
expected allowing for the leptokurtic movement of 
pollen and seeds. Several factors are thought to account 
for this discrepancy, namely, the cumulative contribution 
of pollen by means of leptokurtic but long-tailed 
distribution (Adams, 1992; Ellstrand, 1992), underestimation 
of gene dispersal by neglecting carry-over of 
pollen grains on vectors (Schaal, 1980; Levin, 1981), and 
inbreeding depression (Waser, 1993). 

If a population is genetically structured, and inbreeding 
or outbreeding depression occurs based on the 
genetic relatedness of adult trees, some kind of selection 
on pollen grains could occur. Dow & Ashley (1996) 
presumed the existence of a mechanism allowing female 
flowers of Q. macrocarpa to select preferentially pollen 
from distant sources rather than pollen produced by 
neighbouring trees. It is known that M. obovata suffers 
from high inbreeding depression (Ishida & Nakamura, 
1997), and consequently, pollination between less related 
or spatially distant trees of M. obovata might be favoured 
in spite of the leptokurtic nature of pollen dispersal. 

Chase et al. (1996) revealed that most mating events 
in Pithecellobium elegans were not between the closest 
neighbours, because of variation in phenology or in 
flowering behaviour between adult trees. Many tree 
species show large fluctuations in flowering among years 
with or without synchronization between trees in a 
population (Kelly, 1994; Isagi et al., 1997). In the case of 
episodic flowering without synchronization in a population, 
only some of the trees in the population can 
contribute to reproduction in a given year, and this may 
result in pollination between distant trees. Mating events 
in M. obovata were also not usually between nearest 
neighbours: more than 70% of pollination occurred 
between non-nearest neighbours (Fig. 5), and this might 
also stem from differences in phenology of individual 
flowers. Although M. obovata has a long flowering 
period, up to 40 days, the longevity of each protogynous 
flower is a few days: duration of the consecutive female 
and male stages is about 1-2 days each. In most 
cases, only a few flowers on an adult tree 
bloom on a given day during the flowering period. 
Therefore, even within a flowering season, each individual 
tree may switch among male, female and bisexual 
phases, and thus not all trees in a population can 
contribute as pollen donors to flowers in the female 
stage at the same time. 

For M. obovata, (i) long-distance seed dissemination 
by birds, (ii) inbreeding depression and (iii) limitation in 
acceptance of pollen for each tree caused by differences 
in flowering period of each individual flower are likely 
to cause long-distance gene flow and increase gene 
exchange between less related or distant trees. Although 
the present population of M. obovata is in a physically 
distinct landscape component, a watershed, the 
amount of gene flow from outside the watershed was 
sufficient to prevent genetic differentiation by means of 
genetic drift. This agrees with the fact that most tree 
species do not exhibit much genetic differentiation 
among populations: usually more than 90% of the total 
genetic variation is found within each population 
(Ledig, 1986). 
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Abstract 



The accurate assignment of paternity in natural plant populations is required to address important issues 
in evolutionary biology, such as the factors that affect reproductive success. Newly developed molecular 
fingerprinting techniques offer the potential to address these aims. Here, we evaluate the utility of a new 
PCR-based multi-locus fingerprinting technique called Amplified Fragment Length Polymorphism 
(AFLP) for paternity studies in Persoonia mollis (Proteaceae). AFLPs were initially scored for five 
individuals from three taxonomic levels for 64 primer pairs: between species (P. mollis and P. levis), 
between subspecies (P. mollis subsp. nectens and subsp. livens), between individuals within a single 
population of P. mollis, as well as for a naturally pollinated seed from a single P. mollis subsp. nectens 
plant. In total, 1164 fragments (24.6% of all fragments) were polymorphic between species, 743 
(16.5%) between subspecies, 371 (8.6%) between individuals within a single population and 265 (6.2%) 
between a plant and its seed. Within a single P. mollis population of 14 plants, 42 polymorphic 
fragments were scored from profiles generated by a single AFLP primer pair. The mean frequency of the 
recessive allele (q) over these 42 loci was 0.773. Based on these observations, it will be feasible to 
generate well over 100 polymorphic AFLP loci with as few as three AFLP primer pairs. This level of 
polymorphism is sufficient to assign paternity unambiguously to more than 99% of all seed in 
experiments involving small, known paternity pools. More generally, the AFLP procedure is well suited 
to molecular ecological studies, because it produces more polymorphism than allozymes or RAPDs but, 
unlike conventionally developed microsatellite loci, it requires no prior sequence knowledge and 
minimal development time. 

Introduction 

The transmission of genes from one generation to the next by sexual reproduction is a 
process of major evolutionary significance, as it influences the genetic makeup of future 
generations. Despite its significance, many fundamental questions remain unanswered. For 
example, what are the factors that influence reproductive success? In plants, do some pollen 
donors sire more seeds than others in natural populations? What is the intensity of 
post-pollination selection? What are the evolutionary consequences of variable reproductive 
success? Answers to these sorts of questions require the accurate assignment of male 
parentage to progeny using genetic markers in conjunction with appropriate experimental 
manipulations of natural populations. 

We are conducting experiments to address some of these questions in natural populations 
of Persoonia mollis R.Br. (Proteaceae), a long-lived, fire-sensitive shrub, occurring up to 
c. 250 km south and up to c. 150 km west of Sydney, New South Wales (NSW), Australia. 
Currently, nine subspecies are recognised (Krauss and Johnson 1991). Flowers are borne 
in the axils of leaves, and are pollinated by a suite of bees (Bernhardt and Weston 
1996). The species is completely outcrossing due to a 'pseudo' self-incompatibility 
mechanism that prevents almost all self-pollen tube growth in the style, coupled with 
preferential development of outcrossed seed (Krauss 1994a, b). Ovaries contain two ovules, 
only one of which usually matures into seed. Since individual stigmas of P. mollis usually 
receive mixed pollen loads (Krauss 1994a), and, given that there is usually only one (male 
genotype) winner per ovary, there is enormous potential for post-pollination selection 
through male-male competition among pollen and/or female choice, both of which can affect 
male reproductive success (Snow 1994). To measure the intensity of post-pollination 
selection, we are conducting pollination-manipulation experiments in a small natural 
population of P. mollis subsp. nectens at Sublime Point, NSW. Final results of this 
experiment will be presented in detail elsewhere, once seeds have been assigned paternity. 
Our purpose here is to describe the genetic methodology and to evaluate its utility for 
paternity analysis in this system. 

© CSIRO 1998    0067-1924/98/030533 

S. L. Krauss and R. Peakall 

Allozymes have been used to assign paternity, although only rarely in natural plant 
populations (e.g. Ellstrand and Marshall 1986; Adams et al. 1992). In P. mollis, for example, 
only 61 of 940 (6.5%) seeds genotyped in natural populations could be assigned paternity 
unambiguously using allozymes (Krauss 1994a). Similarly, in Chamaelirium luteum, only 55 
of 2255 (2.4%) seeds could be assigned paternity unambiguously (Smouse and Meagher 
1994). Except in special circumstances, for example where plants have singly-sired 'clutches' 
(e.g. milkweeds, Broyles and Wyatt 1990), allozymes are (at best) only sufficiently variable 
to partially assign paternity (Chakraborty et al. 1988). Clearly, more variable genetic markers 
are required. Below, we briefly introduce some of the new molecular techniques that may 
enable unambiguous paternity assignment in natural plant populations, before evaluating one 
of these methods (AFLPs) for this purpose. 



Brief Background to Molecular Markers 

The revolutionary advent of the polymerase chain reaction (PCR) (Mullis and Faloona 
1987; Saiki et al. 1988) has enabled the development of new methods that overcome the 
limitations of allozymes and RFLP methods, and in theory allowed access to a vast array of 
genetic markers. The PCR is a procedure that can exponentially replicate either single-locus 
or multi-locus DNA, resulting in up to a million-fold amplification (Arnheim et al. 1990). 
The PCR reaction consists of template DNA, DNA polymerase, two synthesised primers 
(typically 8-30 nucleotides in length) that anneal to complementary regions of template 
DNA, and other additives in a buffer. This reaction mix is subjected to repeated temperature 
cycles in a thermocycler. First the mix is heated to 94°C to separate the double-stranded 
DNA. Next the temperature is lowered (to between 40°C and 60°C) to allow the primers to 
anneal to their complementary sequences. Finally, the temperature is raised to 72°C to allow 
the addition of nucleotides between the flanking primers by the polymerase. The new 
double-stranded DNA fragments serve as the starting point for the next cycle, and fragment copy 
number thus increases exponentially. From small amounts of starting material, PCR can yield 
sufficient DNA for visualisation on an electrophoretic gel. Specific fragments that differ in 
length can be readily detected as genetic markers that vary both within and among 
individuals (Arnheim et al. 1990). 
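
The exponential increase in copy number can be made concrete with a small sketch. It assumes an idealised reaction in which every cycle multiplies the number of copies by the same factor; the function name and the `efficiency` parameter are illustrative and do not come from the paper.

```python
def copies_after(cycles, start_copies=1, efficiency=1.0):
    """Template copies after `cycles` rounds, each multiplying by (1 + efficiency)."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return start_copies * (1.0 + efficiency) ** cycles


print(copies_after(20))  # 1048576.0
print(copies_after(30))  # 1073741824.0
print(round(copies_after(20, efficiency=0.9)))
```

Under perfect doubling, 20 cycles already give about a million copies per starting template, which matches the million-fold amplification quoted above; real reactions fall short of perfect doubling, which the `efficiency` argument is meant to mimic.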

The many variations of the basic PCR procedure can be broadly defined by three major 
approaches. Firstly, sequence-tagged-site (STS) PCR uses two different specific primers, 
complementary to opposite strands of conserved DNA flanking regions, to amplify the 
intervening sequences. Microsatellites, or simple-sequence-repeats (SSRs), are a type of STS 
marker. SSRs consist of tandemly repeated units of a short nucleotide motif one to six base 
pairs long. Dinucleotide repeats (e.g. CACACA ...), trinucleotide repeats (e.g. 
AATAATAAT ...), and tetranucleotide repeats (e.g. GATAGATAGATA ...) are the most 
common, and are widely distributed throughout the genomes of eukaryotes (Jarne and 
Lagoda 1996). Different alleles show different numbers of repeat units, which manifest as 
length polymorphisms on a gel. Being co-dominant markers, both heterozygotes and 
homozygotes can be detected at single loci. The utility of SSRs results from their inherent 
multi-allelic variability. In humans, for example, heterozygosities typically exceed 50%, with 
up to 50 alleles per locus (Weber 1990). Similar patterns of variability at SSR loci have been 
found in mammals, birds, insects and plants (Queller et al. 1993; Chase et al. 1996; Gupta et 
al. 1996; Koreth et al. 1996; Primmer et al. 1996). By virtue of their variability and 
co-dominance, SSRs are considered ideal genetic markers for a wide range of applications 
including paternity analysis. The major disadvantage of SSRs is that sequences on either side 
of the SSRs must be known to design appropriate primers for PCR assay. Thus, a substantial 
investment in time and cost is required to develop these markers via the construction of 
genomic libraries and DNA sequencing. 
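
A short sketch can show why alleles with different repeat numbers appear as length polymorphisms. The flanking sequences, repeat counts and function name are hypothetical; only the CA dinucleotide motif comes from the examples in the text.

```python
import re


def longest_repeat_run(sequence, motif):
    """Largest number of consecutive copies of `motif` found in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence)]
    return max(runs, default=0)


# Hypothetical alleles: identical flanks, different numbers of CA repeats.
flank_left, flank_right = "GGTACCTT", "AAGCTTGG"
allele_1 = flank_left + "CA" * 12 + flank_right
allele_2 = flank_left + "CA" * 15 + flank_right

for allele in (allele_1, allele_2):
    print(longest_repeat_run(allele, "CA"), len(allele))
```

The two hypothetical alleles differ by three repeat units and therefore by six bases in amplified length, which is the kind of difference resolved on a gel.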

A second PCR approach overcomes the need for sequence knowledge, but brings its own 
limitations. Williams et al. (1990) and Welsh and McClelland (1990) showed that by using a 
single short primer of known but arbitrary sequence under low-stringency PCR conditions, 
polymorphic multi-locus DNA profiles can be produced. Although other names exist for this 
procedure, it is most widely known as Random Amplified Polymorphic DNA (RAPDs). In 
general, RAPDs are dominant, with polymorphism revealed as band presence or absence. 
Typically, 5-20 bands are produced per PCR reaction, with each band assumed to represent a 
single locus. While dominance reduces the information content per locus, RAPDs tend to 
reveal more variation than allozymes, because many more loci can be assayed (Peakall et al. 
1995). Concerns have been raised, however, about run-to-run repeatability and PCR artefacts 
which may occur as a consequence of the low-stringency conditions used (Morell et al. 

A third PCR approach combines the features of both the STS and RAPD techniques. The 
Amplified Fragment Length Polymorphism (AFLP) procedure is based on selective PCR 
amplification of particular restriction fragments from a total digest of genomic DNA (Lin and 
Kuo 1995; Vos et al. 1995). The basic method involves restriction of the genomic DNA, 
ligation of oligonucleotide adapters to the DNA fragments, and high-stringency selective 
amplification of a subset of all the fragments in the total digest. The ligation of 
oligonucleotide adapters enables PCR to be performed for any species without prior sequence 
knowledge. The selective amplification uses primers of complementary sequence to the 
adapters plus one to three additional arbitrary nucleotides. Subsequent electrophoresis 
of the PCR product typically reveals a complex multi-locus profile of up to 100 bands. The 
bands are generally dominant markers, with polymorphism detected as band presence or 
absence. The AFLP method has a number of significant advantages over RAPDs. First and 
foremost, the PCR conditions are far more stringent and optimised for plants, producing 
highly reproducible profiles. When combined with fluorescently labelled primers, 
electrophoresis on an automated sequencer, and computer data storage and analysis, the 
accuracy of the procedure is unsurpassed for a multi-locus fingerprinting technique. 

Although a new technique, AFLPs have already been used to assess genetic relationships, 
to quantify levels of genetic diversity, and to identify cultivars of agriculturally important 
plants (e.g. Hill et al. 1996; Maughan et al. 1996; Powell et al. 1996; Sharma et al. 1996; 
Donini et al. 1997; Qi and Lindhout 1997; Van Toai et al. 1997). In this paper, we consider 
the utility of AFLPs for ecological genetic studies in natural plant populations. Specifically, 
we evaluate polymorphism generated by AFLP for paternity assignment in natural 
populations of P. mollis. 



Materials and Methods 

DNA Extraction 



DNA was isolated from c. 30 mg of fresh new leaf using a modified CTAB procedure 
([...] 1993), and as [...] for AFLP by Vos et al. (1995), but with the addition 
of a phenol-chloroform step. DNA was similarly extracted from a naturally pollinated seed harvested from 
one plant in the study population. DNA samples were stored in low Tris-EDTA (TE) buffer [...] 
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Amplified Fragment Length Polymorphism (AFLP) 
The AFLP procedure involves four steps (Fig. 1). 

(i) Restriction of the DNA. For each sample, approximately 200 ng of DNA was digested with 2.5 
units of EcoRI/MseI restriction enzyme in a reaction volume of 25 µL, and incubated at 37°C for 
2 h. Samples were then transferred to a 70°C bath for 15 min, before briefly cooling on ice. 

(ii) Ligation of adapters. One unit of DNA ligase and 24 µL of adapter ligation solution were added 
to the digested DNA from (i), incubated at 20°C for 2 h, then diluted 1:10 with TE buffer. 

(iii) Preselective amplification by PCR. Five µL of the diluted ligation mix was combined with 40 µL 
pre-amplification primer solution, 5 µL 10X PCR buffer for AFLP, and 2.5 units Taq DNA polymerase 
in a PCR plate, and PCR was performed for 20 cycles of 94°C for 30 s, 56°C for 60 s, and 72°C for 60 s. 
Subsequently, the pre-amplification mixture was diluted 1:50 with TE buffer. 

(iv) Selective amplification by PCR. This step uses primers that match the known adapter sequence, 
plus three selective nucleotides, to reduce the complexity of the profile. For each of 64 primer pairs 
(Table 1), the following were added to 2.5 µL of each diluted, pre-selective DNA sample from (iii): 
7.5 ng EcoRI-primer, 15 ng MseI-primer, 3.95 µL MilliQ water, 1.0 µL 10X PCR buffer, and 0.25 units 
Taq DNA polymerase. A touchdown PCR reaction commenced with one cycle of 94°C for 30 s, 65°C 
for 30 s, and 72°C for 60 s. In subsequent cycles, the annealing temperature was reduced in 1°C steps to 
56°C, followed by 23 cycles at 56°C. All solutions, except for Taq DNA polymerase, come in kit form 
and were purchased from Life Technologies. Fluorescently labelled primers were purchased from Perkin 
Elmer. PCR was performed on a Corbett Research FTS-960 thermal sequencer. 
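
The touchdown programme in step (iv) can be written out as a list of annealing temperatures, one per cycle. The sketch rests on one reading of the text, namely that the cycle in which the ramp first reaches 56°C is separate from the 23 cycles that follow at 56°C; the function name and parameters are illustrative. Each cycle would also include the 94°C denaturation and 72°C extension steps, which are omitted here.

```python
def touchdown_annealing(start=65, floor=56, step=1, cycles_at_floor=23):
    """Annealing temperature (deg C) for each cycle of a touchdown programme."""
    # Ramp from `start` down to `floor` inclusive: 65, 64, ..., 56.
    ramp = list(range(start, floor - 1, -step))
    # Assumed: the fixed-temperature cycles come after the ramp has ended.
    return ramp + [floor] * cycles_at_floor


schedule = touchdown_annealing()
print(schedule[:10])
print(len(schedule))
```

Under this reading the programme has 10 ramp cycles plus 23 cycles at 56°C; if the ramp were taken to stop at 57°C, the total would be one cycle shorter.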



[Figure 1 diagram: AFLP procedure for 1 primer pair. 1. Digestion of DNA (EcoRI & MseI). 2. Ligation of adapters (EcoRI adapter + MseI adapter). 3. Pre-selective amplification by PCR (EcoRI primer + 1; MseI primer + 1). 4. Selective amplification by PCR (EcoRI primer + 3, fluorescently labelled; MseI primer + 3).] 

Fig. 1. The four major steps of the AFLP procedure. See text for a detailed explanation. 



Markers for Paternity Analysis in Persoonia mollis 






Table 1. AFLP selective amplification primers used here 
to generate 64 AFLP primer pairs 

Fluorescently labelled primers    Unlabelled primers 
EcoRI-ACT                         MseI-CAA 
EcoRI-ACA                         MseI-CAC 
EcoRI-AAC                         MseI-CAG 
EcoRI-ACC                         MseI-CAT 
EcoRI-AGC                         MseI-CTA 
EcoRI-AAG                         MseI-CTC 
EcoRI-AGG                         MseI-CTG 
EcoRI-ACG                         MseI-CTT 
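
The 64 primer pairs follow from combining each of the eight EcoRI primers with each of the eight MseI primers in Table 1, as the sketch below shows. The variable names are illustrative. The final line is a back-of-envelope assumption, not a figure from the paper: if each selective base matched a random quarter of fragments, three selective bases on each primer would amplify about 1 in 4096 restriction fragments.

```python
from itertools import product

# Selective extensions listed in Table 1.
ECORI_SELECTIVE = ["ACT", "ACA", "AAC", "ACC", "AGC", "AAG", "AGG", "ACG"]
MSEI_SELECTIVE = ["CAA", "CAC", "CAG", "CAT", "CTA", "CTC", "CTG", "CTT"]

# Every labelled EcoRI primer combined with every unlabelled MseI primer.
primer_pairs = [
    (f"EcoRI-{e}", f"MseI-{m}") for e, m in product(ECORI_SELECTIVE, MSEI_SELECTIVE)
]

print(len(primer_pairs))  # 64
print(primer_pairs[0])    # ('EcoRI-ACT', 'MseI-CAA')

# Assumed equal base frequencies: share of fragments matching 3 + 3 selective bases.
selective_fraction = 1 / 4 ** (3 + 3)
print(selective_fraction)  # 0.000244140625
```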



The fluorescently labelled amplified fragments were analysed by gel electrophoresis (5% acrylamide 
gels), using the ABI Prism 377 Automated Genetic Analysis System (AGAS). In this system, fragments 
are detected by laser, and are accurately sized by the inclusion of internal size standards that are labelled 
with a uniquely coloured fluorescent dye. Laser technology and fluorescent labels permit more efficient 
electrophoresis, allowing visualisation of up to three different PCR products (with different coloured 
fluorescent primers) per lane, with the fourth colour devoted to the size standard. Digitally converted 
raw data are saved on the computer as samples migrate past the fluorescence detector. Multi-locus 
profiles were visualised using ABI GeneScan software. AFLP profiles were scored for presence or 
absence of fragments, as well as for fragment intensity, as revealed by peak height, with the aid of ABI 
Genotyper software. 

To maximise the efficiency of this preliminary evaluation, we first contrasted AFLP profiles for 64 
primer pairs for each of five individuals for four hierarchical levels of comparison. The contrasts were 
between a single plant of P. mollis subsp. nectens from Sublime Point and (i) its seed, (ii) a co-occurring 
plant, (iii) P. mollis subsp. livens, and (iv) P. levis. To test the power of the AFLP method to distinguish 
between genotypes within populations, we chose two individuals of P. mollis at Sublime Point that had 
identical allozyme phenotypes (Krauss 1997). Replicate extractions and AFLP analyses were made for 
each plant. The number of fragments of lengths between 80 and 500 bases, the number of polymorphic 
fragments, and the percentage of all fragments that were polymorphic, were scored for each primer pair. 
Secondly, a more detailed investigation was made for one of the most polymorphic AFLP primer pairs 
for 14 plants within the Sublime Point population of P. mollis subsp. nectens. The number of 
polymorphic fragments and the frequency of each of these fragments were scored. 
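
The presence/absence scoring described above can be sketched as a comparison of two sets of fragment sizes. The fragment sizes and function name are hypothetical, and counting the total as the fragments seen in at least one of the two profiles is an assumption; real scoring also has to bin peaks that differ slightly in measured size. Only the 80 to 500 base window comes from the text.

```python
def compare_profiles(profile_a, profile_b, min_size=80, max_size=500):
    """Return (total fragments, polymorphic fragments, percent polymorphic)."""
    a = {s for s in profile_a if min_size <= s <= max_size}
    b = {s for s in profile_b if min_size <= s <= max_size}
    total = a | b        # fragments present in at least one profile
    polymorphic = a ^ b  # fragments present in exactly one profile
    percent = 100.0 * len(polymorphic) / len(total) if total else 0.0
    return len(total), len(polymorphic), percent


# Hypothetical fragment sizes (bases) for two plants; not data from the study.
plant_1 = [75, 102, 150, 213, 228, 300, 410]
plant_2 = [102, 150, 225, 300, 410, 512]

n_total, n_poly, pct = compare_profiles(plant_1, plant_2)
print(n_total, n_poly, round(pct, 1))
```

In this toy case the fragments at 75 and 512 bases fall outside the scoring window, leaving seven scored fragments of which three are polymorphic.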



A portion of one multi-locus profile, as created by GeneScan software, for each of two co- 
occurring individuals of P. mollis subsp. nectens, shows five clear polymorphic fragments 
distinguishing these individuals (Fig. 2). The quality of individual multi-locus profiles was 
occasionally affected by poor PCR reactions or by variable gel conditions, such as bubbles in 
the gel matrix and sub-standard wells for loading samples. However, these factors reduced 
peak heights (and therefore yielded a weaker profile), rather than changing the qualitative 
nature of the profile (presence or absence of peaks) in replicate samples. Poor profiles were 
re-run. 

On average, approximately 70 fragments were scored per primer pair. In total, 64 AFLP 
primer pairs generated 4722 fragments, of which 1164 (24.6%) were polymorphic for the 
inter-species comparison between one plant of P. mollis and one plant of P. levis (Table 2). 
For the inter-subspecies comparison between one plant of P. mollis subsp. nectens and one 
plant of subsp. livens, 743 fragments (16.5%) were polymorphic from a total of 4509 
fragments. From a total of 4323 fragments, 371 (8.6%) were polymorphic for the 
within-population comparison between two plants of P. mollis subsp. nectens at Sublime Point. 
The comparison between a mother plant and its naturally pollinated seed generated 4279 
fragments, 265 (6.2%) of which were polymorphic (Table 2). 

Fig. 2. A portion of the AFLP profile as displayed by GeneScan for two co-occurring plants of 
P. mollis subsp. nectens. Shown is the profile for M-CAC and E-ACG for fragments between 195 and 
280 bases in length. The size standards (200 and 250 bases) are shaded peaks. Note the polymorphisms 
at 213, 225, 228, 241 and 275 bases (arrowed). 
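
The percentages quoted for the four comparisons follow directly from the totals row of Table 2. The short Python sketch below recomputes them as a check. The dictionary keys and the function name are illustrative choices, not part of the original study.

```python
# Totals row of Table 2: (total fragments, polymorphic fragments) per comparison.
totals = {
    "between species": (4722, 1164),
    "between subspecies": (4509, 743),
    "within population": (4323, 371),
    "mother/seed": (4279, 265),
}


def percent_polymorphic(n_fragments, n_polymorphic):
    """Percentage of all scored fragments that are polymorphic."""
    return 100.0 * n_polymorphic / n_fragments


percentages = {
    name: round(percent_polymorphic(frags, polys), 2)
    for name, (frags, polys) in totals.items()
}

for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```

The rounded values agree with the "% polys" entries in the totals row (24.65, 16.48, 8.58 and 6.19).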



Polymorphism varied markedly for different primer pairs. For example, one of the most 
polymorphic primer pairs (M-CAG and E-ACA) produced 42 polymorphic fragments (52% 
of all fragments) between species, 25 (33%) between subspecies, 15 (23%) between 
individuals within the one population of P. mollis, and 10 (15.6%) between a maternal plant 
and one of its seeds. The least variable primer pair (M-CAT and E-AAG) produced no 
polymorphism (Table 2). The diminishing marginal information return, when adding primer 
pairs in order of decreasing levels of polymorphism, is slight for the species and subspecies 
comparison, and more pronounced for the within-population and mother-offspring 
comparisons (Fig. 3), highlighting the greater difficulty of finding markers for paternity 
analysis. Thus, in our data set, 40 primer pairs account for almost all polymorphic fragments 
between two individuals of P. mollis co-occurring within a single population, and the seven 
most polymorphic primer pairs generated over 100 polymorphic fragments. 
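
The cumulative curve described above (primer pairs added from most to least polymorphic) can be sketched in a few lines of Python. This is only an illustration: it uses the seven highest within-population counts that can be read from Table 2 plus the least variable pair, not the full set of 64, and the variable names are assumptions made for the example.

```python
from itertools import accumulate

# Within-population polymorphic fragment counts for a subset of primer pairs,
# read from Table 2. This is not the full set of 64 primer pairs.
within_pop = {
    "M-CTC/E-ACG": 16,
    "M-CAG/E-ACA": 15,
    "M-CTA/E-AGG": 15,
    "M-CTG/E-AGG": 15,
    "M-CTG/E-ACT": 15,
    "M-CAC/E-ACG": 13,
    "M-CAC/E-ACC": 12,
    "M-CAT/E-AAG": 0,
}

# Rank primer pairs from most to least polymorphic, then accumulate.
ranked = sorted(within_pop.items(), key=lambda item: item[1], reverse=True)
cumulative = list(accumulate(count for _, count in ranked))

for (pair, count), running_total in zip(ranked, cumulative):
    print(f"{pair}: {count} polymorphic, {running_total} cumulative")
```

With these values the running total after seven primer pairs is 101, consistent with the statement that the seven most polymorphic pairs generated over 100 polymorphic fragments.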

The most informative primer pairs from the inter-species comparison were M-CTG and 
E-ACG (28 polymorphic fragments, 33% of all fragments), M-CAG and E-ACA 
(42 fragments, 52%), M-CTG and E-ACT (40 fragments, 34%), and M-CAC and E-ACT 
(34 fragments, 30%) (Table 2). The most informative primer pairs from the inter-subspecies 
comparison were M-CTG and E-ACT (29 fragments, 27%), M-CAG and E-ACA 
(25 fragments, 33%), M-CTA and E-AGG (25 fragments, 24%), and M-CAC and E-ACT 
(23 fragments, 21%). The most informative primer pairs from the within-population 
comparison were M-CTC and E-ACG (16 fragments, 19%), M-CAG and E-ACA 
(15 fragments, 23%), M-CTA and E-AGG (15 fragments, 15%), M-CTG and E-AGG 
(15 fragments, 17%) and M-CAC and E-ACG (13 fragments, 20%). The most informative primer 
pairs from the mother/seed comparison were M-CTC and E-ACG (13 fragments, 16%), 
M-CTG and E-AGG (13 fragments, 14%), M-CTG and E-ACT (11 fragments, 10%) and 
M-CAC and E-ACG (10 fragments, 15%) (Table 2). Therefore, different sets of primer pairs 
may be the most informative at different taxonomic levels. 

In a population of 14 plants of P. mollis subsp. nectens at Sublime Point, NSW, 
42 polymorphic fragments were scored from profiles generated by the AFLP primer pair 

Table 2. Number of polymorphic fragments (# polys), total number of fragments (# frags) and 
percentage of all fragments polymorphic (% polys) for each of 64 AFLP primer pairs between one 
individual of P. mollis subsp. nectens and (i) one individual of P. levis (between species), (ii) one 
individual of P. mollis subsp. livens (between subspecies), (iii) one co-occurring individual of P. mollis 
subsp. nectens (within population), and (iv) one progeny (seed) 

                           Between species      Between subspecies       Within population             Mother/seed
M-primer E-primer  # frags # polys % polys # frags # polys % polys # frags # polys % polys # frags # polys % polys
CAA      AAC (Y)        56      14      25      56      10    17.8      52       4     7.7      50       2       4
         AAG (G)        64      11    17.2      71      10    14.1      62       3     4.8      62       2     3.2
         ACA (B)        63      21    33.3      56      10    17.9      51       2     3.9      50       2       4
         ACC (Y)        54      20      37      54      13    24.1      52       5     9.6      48       2     4.2
         ACG (G)        46      15    32.6      46      12    26.1      43       4     9.3      40       1     2.5
         ACT (B)        99      23    23.2      99      14    14.1      85       8     9.4      84       3     3.6
         AGC (Y)         ?       ?       ?       ?       ?       ?       ?       1     2.6      39       2     5.1
         AGG (G)       104      18    17.3      89       8       9      92       7     7.6      90       4     4.4
CAC      AAC (Y)        38       3     7.9      36       3     8.3      36       0       0      36       0       0
         AAG (G)        91      23    25.3      85      12    14.1      84       5     5.9      83       2     2.4
         ACA (B)        81      11    13.6      75       7     9.3      73       4     5.5      73       6     8.2
         ACC (Y)        91      25    27.5      94      21    22.3      84      12    14.3      87       9    10.3
         ACG (G)        77      18    23.3      66      13    19.7      66      13    19.7      67      10    14.9
         ACT (B)       112      34    30.4     111      23    20.7     104      10    10.4     101       7     6.9
         AGC (Y)         ?       ?       ?       ?       ?       ?      49       4     8.2      48       1     2.1
         AGG (G)        97      32      33      96      20    20.8      88       7     7.9      85       5     5.9
CAG      AAC (Y)        74      24    32.4      74      14    18.9      65       7    10.7      65       3     4.6
         AAG (G)        82      11    13.4      84       8     9.5      82       6     7.3      78       2     2.5
         ACA (B)        81      42    51.9      75      25    33.3      66      15    22.7      64      10    15.6
         ACC (Y)       119      26    21.8     106      11     9.6     110       7     6.4     107       4     3.7
         ACG (G)        63      32    50.8      60      20    33.3      55       8    14.5      51       1       2
         ACT (B)       127      32    25.2     109      22    20.2     102       7     6.9     100       4       4
         AGC (Y)        91      21    23.1      82      11    13.4      86       7     8.1      79       4     5.1
         AGG (G)        97      17    17.5      92       9     9.8      93       6     6.5      93       9     9.7
CAT      AAC (Y)        38       2     5.3      38       2     5.3      37       1     2.7      37       1     2.7
         AAG (G)        40       0       0      40       0       0      40       0       0      40       0       0
         ACA (B)        90      27      30      89      14    15.7      83       5       6      83       2     2.4
         ACC (Y)        42      18    42.8      36       8    22.2      37       6    16.2      34       5    14.7
         ACG (G)        63      13    20.6      64      14    21.9      59       5     8.5      59       3     5.1
         ACT (B)        73      10    13.7      70       4     5.7      71       2     2.8      74       2     2.7
         AGC (Y)        78      23    29.5      75      17    22.7      68       6     8.8      68       6     8.8
         AGG (G)        64      13    20.3      57       3     5.3      56       1     1.8      54       1     1.8
CTA      AAC (Y)        44       2     4.5      44       4     9.1      43       1     2.3      42       1     2.4
         AAG (G)        83      20    24.1      79      15      19      76       9    11.8      75      10    13.3
         ACA (B)        84      11    13.1      77       7     9.1      77       2     2.6      75       2     2.7
         ACC (Y)        71      12    16.9      67      12    17.9      60       3       5      62       3     4.8
         ACG (G)        69      23    33.3      66      18    27.3      58       7    12.1      61       4     6.6
         ACT (B)        93      18    19.4      92      13    14.1      85       4     4.7      86       3     3.5
         AGC (Y)        68       7    10.3      67       8    11.9      69       7    10.1      69       7    10.1
         AGG (G)       109      28    25.7     106      25    23.6     102      15    14.7      99       8     8.1

? = value illegible in the source copy.




S. L. Krauss and R. Peakall 



Table 2. (continued) 

                           Between species      Between subspecies       Within population             Mother/seed
M-primer E-primer  # frags # polys % polys # frags # polys % polys # frags # polys % polys # frags # polys % polys
CTC      AAC (Y)        92      22    23.9      85       9    10.6      83       6     7.2      81       5     6.2
         AAG (G)        68       5     7.3       ?       ?       ?       ?       ?       ?       ?       ?     4.3
         ACA (B)       114      19    16.7     111      12    10.8     108       9     8.3     112       6     5.3
         ACC (Y)        56       4     7.1      56       4     7.1      55       3     5.5      54       2     3.7
         ACG (G)        78      28    35.9      81      22    27.2      84      16      19      80      13    16.2
         ACT (B)        71      23    32.4      62      13    20.9      54       6    11.1      55       5     9.1
         AGC (Y)        55      16    29.1      54      12    22.2      51       5     9.8      47       2     4.2
         AGG (G)       100      23      23      88       7     7.9      90       5     5.6      90       3     3.3
CTG      AAC (Y)        69       8    11.6      72       7     9.7      61       2     3.1      61       1     1.6
         AAG (G)         ?       ?       ?       ?       ?       ?      57       3     5.3      57       4       7
         ACA (B)        73      22    30.1      70      19    27.1      69       8    11.6      71       8    11.3
         ACC (Y)         ?       ?       ?      59       3     5.1      59       3     5.1      58       3     5.2
         ACG (G)        69      28    40.6      59       8    13.6      59       5     8.5      59       4     6.8
         ACT (B)       118      40    33.9     109      29    26.6     108      15    13.9     110      11      10
         AGC (Y)        48       9    18.7      47      11    23.4      43       0       0      43       1     2.3
         AGG (G)        95      27    28.4      91      18    19.8      90      15    16.7      90      13    14.4
CTT      AAC (Y)        25       4      16      22       1     4.5      22       1     4.5      22       1     4.5
         AAG (G)        44       7    15.9      41       4     9.7      40       0       0      41       1     2.4
         ACA (B)        71      24    33.8      70      16    22.9      72      10    13.9      67       6       9
         ACC (Y)        90      29    32.2      86      16    18.6      78      10    12.8      77       4     5.2
         ACG (G)        61      23    37.7      50       6      12      53       6    11.3      53       6    11.3
         ACT (B)        65      20    30.7      65      16    24.6      60       6      10      61       5     8.2
         AGC (Y)        38      18    47.4      38       9    23.7      33       3     9.1      32       0       0
         AGG (G)        93      22    23.7      94      17    18.1      89       8       9      91       5     5.5
Total                 4722    1164   24.65    4509     743   16.48    4323     371    8.58    4279     265    6.19

? = value illegible in the source copy.





Fig. 3. Plot of cumulative number of polymorphic fragments against number of AFLP primer pairs 
(from most to least polymorphic), for each of four comparisons: a single plant of P. mollis subsp. 
nectens from Sublime Point and (i) its seed ('mother-offspring'), (ii) a co-occurring plant ('population'), 
(iii) a single plant of P. mollis subsp. livens ('subspecies'), and (iv) a single plant of P. levis ('species'). 

M-CAC and E-ACG. The frequency of each of these fragments within the population varied 
from 0.07 (i.e. the fragment unique in the population) to 0.93 (i.e. the fragment possessed by 
all but one plant) (Fig. 4). The mean frequency of the recessive phenotype (absence of a 
fragment, q²) over all 42 fragments was 0.598, and the mean frequency of the recessive allele 
(q), assuming Hardy-Weinberg equilibrium, was 0.773 (s.e. = 0.07). 
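
For a dominant marker, plants lacking a fragment are taken to be recessive homozygotes, so under Hardy-Weinberg equilibrium the recessive allele frequency is the square root of the frequency of the band-absent phenotype. The Python sketch below applies that relation to the mean values reported above. The function name is an illustrative assumption, and the text does not say whether the authors averaged per-locus estimates or converted the mean, so the conversion of the mean shown here is only one reading.

```python
import math


def recessive_allele_frequency(recessive_phenotype_freq):
    """Return q = sqrt(q^2), assuming Hardy-Weinberg equilibrium."""
    if not 0.0 <= recessive_phenotype_freq <= 1.0:
        raise ValueError("a frequency must lie between 0 and 1")
    return math.sqrt(recessive_phenotype_freq)


# Mean band-absent (recessive phenotype) frequency over the 42 fragments.
q = recessive_allele_frequency(0.598)
print(round(q, 3))

# Fragment frequencies for a band carried by 1 or by 13 of the 14 plants.
n_plants = 14
rarest = round(1 / n_plants, 2)
commonest = round(13 / n_plants, 2)
print(rarest, commonest)
```

The result rounds to 0.773, and the two fragment frequencies round to 0.07 and 0.93, matching the range reported for the population of 14 plants.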



Fig. 4. The frequency of each of 42 polymorphic fragments 
generated by the AFLP primer pair M-CAC and E-ACG for 
14 plants in a natural population of Persoonia mollis subsp. 
nectens at Sublime Point, NSW. 



Discussion 

AFLP for Paternity Analysis 

Are AFLPs sufficiently variable to assign paternity unambiguously in natural populations 
of P. mollis? Using the formula described in Lewis and Snow (1992, p. 156), we estimated 
the fraction of offspring for which paternity can be assigned unambiguously to a single 
individual (P_E). The variables included the number of plants contributing to the pollen pool, 
the frequency of the recessive allele (assumed to be constant over all loci), and the number of 
polymorphic loci assayed. In the pollination experiment being conducted by us to address the 
intensity of post-pollination selection, 12 out of 14 possible fathers needed to be excluded 
(the self-male was excluded, as P. mollis is essentially self-incompatible). Preliminary studies 
of population variation generated by one AFLP primer pair (M-CAC and E-ACG) yielded an 
estimated recessive phenotype frequency per locus (q²) of 0.598, and an estimated recessive 
allele frequency (q) of 0.773. Given these estimated frequencies, 50 loci are expected to 
exclude all but one potential father for 88.0% of all seeds, 75 loci will allow unambiguous 
paternity assignment of 98.1% of all seeds, while 100 loci will account for 99.7% of all seeds 
in the study population. This preliminary study has already produced over 42 polymorphic 
loci from just one AFLP primer pair for a population of 14 plants. It is therefore likely that 
well over 100 polymorphic loci will ultimately be generated using as few as three AFLP 
primer pairs. 

These estimates of P_E involve a number of assumptions (Lewis and Snow 1992) 
including, as calculated here, that all loci have a recessive allele frequency of 0.77. As more 
polymorphic loci are identified, we will preferentially be able to use those loci with a 
recessive allele frequency greater than 0.77, increasing the power of P_E. A second 
assumption is that the potential fathers in the pollen pool are not closely related. The 
paternity exclusion probability (P_E) is exaggerated when close relatives (e.g. sibs) compete 
for paternity (Double et al. 1997). In plants, the level of consanguineous matings in natural 
populations can be estimated using the effective selfing model of Ritland and Jain (1981). 
This model separates the estimate of selfing into two components; true selfing and effective 
selfing, due to matings with close relatives. Effective selfing will inflate the estimate of 
selfing. Persoonia mollis is consistently completely outcrossing (Krauss 1994c), so selfing 
(and therefore effective selfing) is negligible. Thus, population genetic structure of P. mollis, 
at the scale of the paternity pool (Levin 1988), does not lead to consanguineous matings, 
probably due to high seed dispersal, high pollen dispersal and extremely high seed mortality. 

Advantages of AFLPs 

Where allozyme analysis produced identical phenotypes for two co-occurring individuals 
of P. mollis at Sublime Point (Krauss 1997), the 64 AFLP primer pairs produced 371 
polymorphic fragments to distinguish between them. At the population level, where 14 alleles 
were detected at 11 co-dominant allozyme loci (Krauss 1997), 42 dominant loci (each with 
two alleles, presence or absence) were scored for one AFLP primer pair. Thus, the AFLP 
method has the power to distinguish between genotypes, even in populations where 
allozymes show very low levels of genetic diversity. Consequently, AFLP may be 
particularly useful for ecological genetic studies within the Proteaceae, which generally 
appears to show unusually low levels of allozyme variability (Krauss 1997). Similarly, AFLP 
produces many more polymorphic loci per primer than RFLPs, SSRs and RAPDs in the 
predominantly selfing Glycine max (Maughan et al. 1996). 

Being a PCR technique, the AFLP method can routinely provide a high throughput. After 
an initial (time-consuming) screening period, where the best (i.e. most polymorphic) primer 
pairs are identified by laboriously scoring profiles for all primer pairs, the scoring of 
individual polymorphic fragments for presence or absence can be achieved rapidly. Using the 
protocol set out in this study, we envisage running and scoring 100 individuals for 100 
polymorphic loci per week. Thus, the AFLP method is ideally suited to ecological genetic 
studies, where many samples are required. 

The use of fluorescently labelled primers, and fluorescent detection using an automatic 
sequencer and GeneScan software, has a number of major advantages over radioactive 
labelling. First, the safety concerns associated with radioactive chemicals are avoided. 
Second, the output is available at the end of a gel run rather than waiting many hours for 
radioactive incubation. Third, digitally modified data are saved onto the computer as 
fragments pass the detector and scoring of electropherograms is more efficient and accurate 
than scoring autoradiographs by eye, particularly as a size-standard is included in each lane 
for accurate sizing of all fragments. Ultimately, scoring of polymorphic fragments will be 
automated, using software developed specifically for this system. Fourth, by using differently 
coloured fluorescent primers, one can multiplex PCR products in a single lane, reducing the 
number of gels required and making the system cost effective per polymorphic fragment. For 
example, by combining PCR products from the three most polymorphic primer pairs, it 
should ultimately be possible to generate over 100 polymorphic loci for small natural 
populations of P. mollis in only one gel lane. 

Limitations of AFLPs 

One problem requiring further work is that of rare disappearing fragments, especially where 
fragments are apparently polymorphic. For example, for the primer pair M-CAA and E-AAG, 
we scored an apparently clear polymorphic fragment of 158 bases in length. However, a 
subsequent replicate run (fresh extraction, repeat AFLP protocol) failed to produce this 
fragment. This could be due to incomplete initial digestion of DNA, perhaps the most common 
source of artifactual polymorphism (Lin and Kuo 1995), poor amplification of this fragment 
during PCR, or contamination of DNA. Rare disappearing fragments have been highlighted 
elsewhere (Stanfield et al. 1996), and spermidine has been suggested as an addition to the 
digestion mixture (Bloch and Grossmann 1987) to alleviate the problem. We have yet to test 
this in our laboratory. On one occasion, we generated a surprisingly variable profile for one 
seed that was clearly the product of incomplete digestion, either due to poor DNA quality or 
insufficient restriction enzymes. Extreme care is needed at this stage of the procedure, and it is 
essential to confirm that all digestion has been complete before ligation. However, AFLPs 
appear to be robust to variation in DNA concentration, and Lin and Kuo (1995) found no 
differences in profiles over a concentration range of 100 ng to 5 µg of genomic DNA. 

In our study, the AFLP method produces dominant bi-allelic markers, scored as band 
presence or absence. Although dominant markers are not nearly as efficient as co-dominant 
markers for this sort of analysis (Lewis and Snow 1992), this problem is largely offset for 
AFLP by the very large number of fragments generated (potentially over 100 polymorphic 
fragments per lane). Note, though, that AFLP markers can potentially be co-dominant, 
particularly where fragments contain SSRs. However, finding the complementary alleles will 
on most occasions be very difficult without appropriate crosses and analysis of F1 and F2 
generations. 

One downside to AFLP analysis is the cost per sample. The most expensive step is the 
ligation of adapters onto restricted fragments, which is common to both radioactive and 
fluorescent methods. This step costs about $A5 per sample using the available kits and 
recommended amounts of mix components. Fluorescent primers, while expensive to 
purchase, are cost effective per analysis and much cheaper than radioactive labels. 
Ultimately, the cost per sample will depend on the number of primer pairs required for the 
assignment of paternity to each seed. We estimate that if three primer pairs are required to 
produce sufficient numbers of polymorphic loci for assignment of paternity, and can be 
multiplexed at three per lane, the cost per sample will be about $A10 for commercially 
available kits from Life Technologies or Perkin Elmer. The cost can be further reduced by 
combining the AFLP selective amplification for three primer pairs (differently labelled) into 
the one, rather than three, PCR reaction well. This reduces the amount of reagents needed, 
and has been successfully performed with other species in our laboratory. 

A major limitation of fluorescent AFLPs for most researchers will be ready access to an 
Automated Genetic Analysis System. However, the number of laboratories with dedicated access 
is increasing, as are the number of commercial laboratories that will perform the electrophoresis. 

Beyond AFLP 

The AFLP method is of value for studies where highly polymorphic genetic markers are 
required because it provides a stepping stone to the access of highly variable SSR loci, with 
potentially reduced development costs. A new technique, called SAMPL (Selective 
Amplification of Microsatellite Polymorphic Loci), uses the AFLP procedure as a starting point 
to find SSR loci within AFLP-generated fragments (Witsenboer et al. 1997). In the final PCR 
amplification, a fluorescently labelled SSR-anchored primer (e.g. 5'-CCGTGTGTGTGTGT-3') 
is combined with one of the two AFLP primers to amplify AFLP-generated fragments that 
contain SSRs. Alternatively, inter-SSR regions can be targeted within the AFLP-generated 
fragments, by using a primer with a 5' SSR sequence (e.g. 5'-AGAGAGAGAGAGAGAGT-3') 
in combination with one of the two AFLP primers for the final PCR amplification. In theory, this 
method has the potential to visualise co-dominant SSR loci within a multilocus profile. 

We have performed preliminary studies of the SAMPL procedure on P. mollis with 
promising results. In a comparison of three individuals of P. mollis from the Sublime Point 
population, using nine SAMPL primers, we have generated multiple fragments and putative 
polymorphisms. If required, these fragments can be sequenced to confirm that they contain 
SSRs. Alternatively, progeny arrays can be run to enable the delineation of co-dominant loci. 

Conclusion 

This study has demonstrated that the AFLP method produces large numbers of 
polymorphic fragments within small natural populations of P. mollis and when contrasting 
two individuals from (i) two Persoonia species, (ii) two P. mollis subspecies, (iii) the same 
P. mollis subsp. nectens population, and (iv) a P. mollis plant and its seed. More specifically, 
the AFLP method produces sufficient polymorphism for the potentially unambiguous 
assignment of paternity in natural populations of P. mollis. In combination with appropriate 
experimental manipulations of natural populations, the AFLP method will allow us to address 
intriguing, and previously intractable, issues in evolutionary biology, such as the intensity of 
post-pollination selection and its effect on male reproductive success. More generally, 
because the AFLP method can be applied to any species, it can be recommended as an 
extremely powerful tool for ecological genetic studies and studies of genetic diversity within 
and among natural plant populations, particularly where other markers such as allozymes 
show little or no variation. 

Summary 

The potential use of the Randomly Amplified Polymorphic DNA (RAPD) technique for characterization and 
assessment of genetic relationships was investigated in nineteen walnut (Juglans regia L.) genotypes used as 
parents or released as cultivars from the breeding program of the University of California at Davis. Most of the 
72 decamer primers used yielded scorable amplification patterns based on discernable bands. The results obtained 
produced a unique fingerprint for each of the walnut genotypes studied. Cluster analysis separated the 19 walnut 
genotypes into two main groups whose differences were related to their pedigree. Genotypes sharing common 
parents tend to group together and with at least one of the parents. Thus, RAPD markers can detect enough 
polymorphism to differentiate among walnut genotypes, even among closely related genotypes, and the genetic 
similarity based on RAPDs appears to reflect the known pedigree information. RAPD technology can be useful 
in current walnut breeding programs, allowing the identification of new cultivars as well as the assessment of the 
genetic similarity among genotypes which will help in selecting the best parents to obtain new genetic combinations. 



Introduction 

The family Juglandaceae consists of seven genera comprising about 60 monoecious tree species. The genus 
Juglans contains about 20 species, all producing edible nuts. Among those, the English or Persian walnut 
(Juglans regia L.) is the most widely cultivated species (McGranahan & Leslie, 1990). Persian walnuts have 
been grown in California since the days of the early Spanish missions. In the last 50 years the genetic base 
of this crop has been enriched with introductions from Asia and Europe, primarily France (Forde & 
McGranahan, 1996). All the commercial cultivars currently grown in California can be considered descendants, 
at least partially, from those gene pools and have originated either as chance seedlings, or from the 
breeding program of the University of California (Serr, 1969). This breeding program, based on crosses 
between late season and laterally fruiting genotypes, has released 15 walnut cultivars (Tulecke & 
McGranahan, 1994). The characterization of these genotypes is important for designing objective and 
repeatable criteria to protect breeders' rights in newly developed cultivars. 

Accurate and rapid cultivar identification is especially important in vegetatively propagated plant 
species such as most fruit trees both for practical breeding purposes and for proprietary rights protection. 
Unfortunately, the traditional methods for characterization and assessment of genetic variability in 
perennial fruit crop species, based on morphological, physiological and biochemical studies, are both time 
consuming and affected by the environment. The introduction of molecular biology techniques, such as 
DNA-based markers, provides an opportunity for genetic characterization that allows direct comparison of 
different genetic material independent of environmental influences (Weising et al., 1995). 

Initial molecular studies in walnut were carried out using isozymes to assess the inheritance of some 
enzyme systems (Arulsekar et al., 1986; Aleta et al., 1993). Isozymes have also been used in Juglans 
to identify genetic variability (Malvolti et al., 1993, 1994), to detect interspecific hybrids (Arulsekar 
et al., 1985; McGranahan et al., 1986; Germain et al., 1993), to identify species and cultivars (Louskas 
et al., 1984; Wenheng, 1984; Arulsekar et al., 1985; Aleta et al., 1989; Germain et al., 1993; Solar et 
al., 1993, 1994) and to assess mating parameters (Rink et al., 1994). RFLP markers have also been used in 
walnuts to determine parentage (Aly et al., 1992), to establish phylogenetic relationships in the genus 
Juglans (Fjellstrom & Parfitt, 1995), to estimate genetic diversity, and to identify cultivars (Fjellstrom 
et al., 1994; Fjellstrom & Parfitt, 1994a, b). Recently, RAPD markers have been used to evaluate the level 
of polymorphism at the interspecific level between Persian walnut (J. regia) and Northern California black 
walnut (J. hindsii (Jeps.) Jeps.) (Woeste et al., 1996a) and to identify a marker linked to 
hypersensitivity to the cherry leafroll virus (Woeste et al., 1996b). 

Figure 1. Pedigree diagram of the cultivars tested. 

Randomly Amplified Polymorphic DNA (RAPD) (Williams et al., 1990, 1993) can be of further use to 
identify closely related walnut cultivars and could complement the results previously obtained with RFLPs, 
mainly because of the higher level of polymorphism obtained with RAPDs. RAPD analysis has already 
proven to be valuable in genotype characterization as well as in population and pedigree analyses in many 
crop species and studies in tree crops are starting to produce interesting results (Hormaza et al., 1994; 
Fabbri et al., 1995). RAPD markers are not only important for the characterization of the germplasm but 
can also be used to evaluate the effects of selection over time and to aid in the development of crossing 
schemes in walnut improvement programs since this method allows the study of the genetic diversity of the 
available germplasm. 

Table 1. Walnut genotypes included in this study 

Code  Genotype         Original source¹
1     56-224           Univ. of California
2     Chandler         Univ. of California
3     Cisco            Univ. of California
4     Conway Mayette   France
5     Franquette       France
6     Howard           Univ. of California
7     Lompoc           Univ. of California
8     Marchetti        California
9     Meylan           France
10    Payne            California
11    Pedro            Univ. of California
12    PI-159568        Afghanistan
13    Serr             Univ. of California
14    Sharkey          China
15    Sunland          Univ. of California
16    Tehama           Univ. of California
17    Tulare           Univ. of California
18    Vina             Univ. of California
19    Waterloo         California

¹ Based on Tulecke & McGranahan, 1994. 

The main objective of this study was to develop RAPD markers to unequivocally characterize 19 closely 
related walnut genotypes from the breeding program of the University of California as well as to compare the 
degree of genetic relatedness obtained through RAPD analysis with that of the expected results obtained from 
pedigree data. 



Materials and methods 

Plant material 

The nineteen genotypes used in this study (Table 1 and Figure 1) were obtained from the walnut collection 
maintained at the University of California Wolfskill Experimental Orchard in Winters, California, USA, 
and are part of the walnut breeding program developed at this institution. They were selected based on a 
pedigree assessment and include recently released cultivars as well as their parental genotypes. 



DNA isolation 

Young leaves from nineteen accessions of J. regia were collected in spring and immediately stored at 
-70 °C prior to DNA extraction. Total DNA was extracted following the method of Doyle & Doyle (1987) with 
minor modifications. Seven grams of leaf tissue were ground to fine powder in liquid nitrogen and added to 
20 ml of 65 °C preheated CTAB buffer (2% CTAB, 1% PVP, 1% β-mercaptoethanol, 0.1% sodium bisulfite, 
1.4 M NaCl, 100 mM Tris-HCl pH 8.0, 20 mM sodium EDTA) and incubated at 65 °C for 60 min. The lysate 
was extracted with 20 ml of chloroform/isoamyl alcohol (24 : 1) and centrifuged for 10 min at 1800 rpm in 
a desktop centrifuge. In order to precipitate the nucleic acids, the aqueous fraction was mixed with an 
equal volume of cold isopropanol. The nucleic acid precipitate was recovered with a glass hook, washed in 
76% ethanol with 10 mM ammonium acetate and air dried overnight before being resuspended in 1 ml TE buffer 
(10 mM Tris-HCl pH 8.0, 1 mM disodium EDTA). The resuspended DNA was treated with RNAase and 
the concentration of extracted DNA was determined using a spectrophotometer at 260 nm. DNA was diluted 
to 10 ng/µl and used for PCR amplification. 

DNA amplification and electrophoresis conditions 

PCR amplification reactions were carried out as described in Williams
et al. (1990) with minor modifications. Reaction mixtures (25 µl total
volume) consisted of 50 ng of template DNA, 10 mM Tris-HCl, pH 8.3,
50 mM KCl, 1.9 mM MgCl2, 0.001% gelatin, 100 µM each of dATP, dGTP,
dCTP and dTTP (Perkin-Elmer-Cetus, Norwalk, Conn.), 0.4 µM primer
(Operon Technologies, Alameda, Calif.) and 0.75 units of Taq DNA
polymerase (Perkin-Elmer-Cetus, Norwalk, Conn.; Promega, Madison,
Wisc.; Life Technologies, Gaithersburg, Md.), overlaid with 25 µl of
mineral oil. DNA amplification was carried out in a DNA Thermal Cycler
480 (Perkin-Elmer-Cetus, Norwalk, Conn.) programmed for 1 cycle of
2 min at 94 °C followed by 40 cycles of 45 sec at 94 °C, 1 min at
38 °C and 2 min at 72 °C. After a final incubation for 5 min at 72 °C
the samples were stored at 4 °C prior to analysis. The PCR amplified
products were resolved by gel electrophoresis in 2.2% SeaKem agarose
(FMC, Rockland, Maine) in TBE buffer (45 mM Tris, 45 mM H3BO3, 1 mM
EDTA). Gels were stained in ethidium bromide (0.5 µg/ml) and then
visualized on a long-wave UV light source.
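
The cycling program described above can be written down as a small data structure, which also gives the total programmed hold time. This is a sketch for illustration only: the stage labels in the comments (denaturation, annealing, extension) are the usual reading of such a program rather than terms used in the text, the names are hypothetical, and ramp times between temperatures and the final 4 °C storage are ignored.

```python
# Each stage: (number of cycles, [(temperature in degrees C, hold in seconds), ...])
PROGRAM = [
    (1, [(94, 120)]),                       # initial denaturation
    (40, [(94, 45), (38, 60), (72, 120)]),  # denaturation, annealing, extension
    (1, [(72, 300)]),                       # final incubation
]


def total_hold_seconds(program):
    """Sum the programmed hold times over all stages and cycles."""
    return sum(
        cycles * sum(hold for _temperature, hold in steps)
        for cycles, steps in program
    )


total_s = total_hold_seconds(PROGRAM)
print(f"{total_s} s of programmed holds ({total_s / 60:.0f} min), ramps excluded")
```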



Figure 2. Gel electrophoresis of the amplification products obtained
with primer OPK-19. Lanes 1 and 20: 123 bp DNA ladder. Lanes 2-19:
cultivars tested.

Data analysis 

Amplified bands were visually scored as present or 
absent. Each amplification fragment useful for discrim- 
ination between genotypes was named by the source 
of the primer (OP = Operon), the kit letter, the primer 
number and its approximate size in base pairs. A sim- 
ilarity matrix was generated using the Nei and Li sim- 
ilarity index (Nei & Li, 1979; Lamboy, 1994) based 
on the proportion of shared amplification fragments 
between two genotypes according to the following 
equation: 

Similarity = 2Nab/(Na + Nb) 

where Nab is the number of scored amplification fragments with the
same molecular weight shared between genotypes 'a' and 'b'; Na is the
number of scored amplification fragments in genotype 'a'; and Nb is
the number of scored amplification fragments in genotype 'b'.
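
The similarity index defined above can be computed directly from presence/absence scores. The following is a minimal sketch, assuming each genotype is scored as a list of 1 (band present) and 0 (band absent) over the same ordered set of markers; the function names and the two example profiles are hypothetical and are not data from this study.

```python
def nei_li_similarity(profile_a, profile_b):
    """Similarity = 2*Nab / (Na + Nb) for two presence/absence profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must score the same set of markers")
    n_a = sum(1 for band in profile_a if band)
    n_b = sum(1 for band in profile_b if band)
    n_ab = sum(1 for x, y in zip(profile_a, profile_b) if x and y)
    if n_a + n_b == 0:
        raise ValueError("index is undefined when neither genotype has bands")
    return 2 * n_ab / (n_a + n_b)


def similarity_matrix(profiles):
    """Symmetric matrix of pairwise similarities for a list of profiles."""
    n = len(profiles)
    return [
        [nei_li_similarity(profiles[i], profiles[j]) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical example: five markers scored in two genotypes.
genotype_a = [1, 1, 0, 1, 0]   # Na = 3
genotype_b = [1, 0, 0, 1, 1]   # Nb = 3, two bands shared with genotype_a
example = nei_li_similarity(genotype_a, genotype_b)   # 2*2 / (3 + 3)
```

Because only shared presences enter the numerator, markers absent from both genotypes do not raise the similarity.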

A dendrogram was constructed based on the simi- 
larity matrix data by applying unweighted pair group 
method with arithmetic averages (UPGMA) cluster 
analysis using the NTSYS-pc computer program ver- 
sion 1.70 (Exeter Software, Setauket, New York). 
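
The clustering step can be illustrated with a small pure-Python UPGMA routine. This is a sketch of the general method only, not of the NTSYS-pc implementation: it repeatedly joins the two most similar clusters and sets the similarity of the new cluster to every other cluster as the size-weighted average of the two joined clusters. The function name, the labels and the 4 x 4 matrix are hypothetical.

```python
def upgma(similarity, labels):
    """Cluster by UPGMA; return merges as (cluster_1, cluster_2, level)."""
    n = len(labels)
    sizes = {i: 1 for i in range(n)}       # active cluster id -> member count
    names = {i: labels[i] for i in range(n)}
    # Similarities between active clusters, keyed by (smaller id, larger id).
    sim = {(i, j): similarity[i][j] for i in range(n) for j in range(i + 1, n)}
    merges = []
    next_id = n
    while len(sizes) > 1:
        # Join the pair of clusters with the highest average similarity.
        (a, b), level = max(sim.items(), key=lambda item: item[1])
        merges.append((names[a], names[b], level))
        updated = {}
        for c in sizes:
            if c in (a, b):
                continue
            s_ac = sim[(min(a, c), max(a, c))]
            s_bc = sim[(min(b, c), max(b, c))]
            # Size-weighted mean equals the mean over all member pairs.
            updated[(c, next_id)] = (
                sizes[a] * s_ac + sizes[b] * s_bc
            ) / (sizes[a] + sizes[b])
        sim = {k: v for k, v in sim.items() if a not in k and b not in k}
        sim.update(updated)
        sizes[next_id] = sizes.pop(a) + sizes.pop(b)
        names[next_id] = "(" + names[a] + "," + names[b] + ")"
        next_id += 1
    return merges


# Hypothetical similarity matrix for four genotypes A-D.
toy_labels = ["A", "B", "C", "D"]
toy_similarity = [
    [1.0, 0.9, 0.5, 0.3],
    [0.9, 1.0, 0.7, 0.1],
    [0.5, 0.7, 1.0, 0.4],
    [0.3, 0.1, 0.4, 1.0],
]
toy_merges = upgma(toy_similarity, toy_labels)
for first, second, level in toy_merges:
    print(f"{first} + {second} joined at similarity {level:.2f}")
```

Each merge level is the height at which the corresponding branch point would be drawn in the dendrogram.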



Results and discussion 

A total of 72 decamer primers were used to amplify DNA extracted from
the nineteen walnut genotypes used in this study. Almost all the
primers yielded scorable amplification patterns (Figure 2). Some of
the primers, however, produced either no amplification or unreadable
gel smears. Eighteen primers, producing 1-2 polymorphic fragments
each, generated polymorphic banding patterns among the genotypes
studied (Table 2). The total number of polymorphic bands obtained was
23. The apparently low level of polymorphism detected (about 25% of
the primers tested) can be explained by the strict criterion adopted
to score the markers. Only the conspicuous, intensely stained bands
between 250 bp and 1700 bp long were considered for analysis. Each
RAPD analysis was repeated in separate experiments at least twice, and
only highly reproducible markers were considered.

Table 2. Distribution of 23 RAPD markers within the 19 walnut
genotypes. Genotype numbers correspond to those in Table 1. V
indicates presence and indicates absence of the marker.

Some questions have been raised about the reliability of RAPD data
because of their variable nature under different experimental
conditions and because comigrating bands from different individuals do
not necessarily represent homologous amplification products (Newbury &
Ford-Lloyd, 1993; Bachmann, 1994). However, fragment size can be
considered a reliable predictor of homology among closely related
individuals, as is the case in this study, although this is not
necessarily true at higher taxonomic levels (Rieseberg, 1996). In
order to maximize the reliability of the process, the reproducibility
of the results was tested in two different ways. First, the
amplification patterns obtained with three different Taq DNA
polymerases were compared using the same genotype and primer (data not
shown). Second, amplification reactions with DNA obtained from
different accessions of four genotypes ('Chandler', 'Cisco', 'Howard'
and 'Franquette') using two different primers were also compared (data
not shown). In both cases, the pattern of amplification was fully
reproducible.

Table 3. Similarity matrix generated using Nei and Li's index.
Cultivar numbers correspond to those in Table 1

      1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   18   19
 1 1.00
 2 0.79 1.00
 3 0.52 0.43 1.00
 4 0.62 0.62 0.58 1.00
 5 0.52 0.43 0.78 0.58 1.00
 6 0.73 0.73 0.35 0.43 0.23 1.00
 7 0.45 0.45 0.59 0.52 0.71 0.37 1.00
 8 0.78 0.61 0.67 0.58 0.56 0.59 0.47 1.00
 9 0.58 0.50 0.74 0.64 0.63 0.33 0.56 0.74 1.00
10 0.64 0.54 0.47 0.43 0.59 0.50 0.75 0.59 0.44 1.00
11 0.54 0.69 0.67 0.74 0.57 0.40 0.50 0.48 0.54 0.50 1.00
12 0.72 0.64 0.30 0.31 0.30 0.53 0.32 0.50 0.38 0.53 0.26 1.00
13 0.75 0.58 0.42 0.48 0.42 0.44 0.44 0.74 0.60 0.67 0.36 0.76 1.00
14 0.69 0.54 0.09 0.52 0.29 0.50 0.20 0.38 0.27 0.40 0.33 0.61 0.54 1.00
15 0.67 0.58 0.63 0.48 0.53 0.56 0.56 0.53 0.60 0.44 0.36 0.67 0.60 0.36 1.00
16 0.38 0.48 0.50 0.45 0.50 0.40 0.80 0.37 0.47 0.67 0.53 0.22 0.35 0.21 0.35 1.00
17 0.72 0.72 0.60 0.54 0.50 0.53 0.63 0.80 0.67 0.63 0.52 0.76 0.35 0.57 0.67 1.00
18 0.45 0.54 0.59 0.35 0.59 0.37 0.75 0.47 0.44 0.75 0.60 0.32 0.44 0.20 0.33 0.80 0.74 1.00
19 0.58 0.42 0.63 0.64 0.74 0.33 0.89 0.63 0.70 0.67 0.54 0.29 0.50 0.36 0.50 0.71 0.67 0.67 1.00

The results obtained with the eighteen primers that yielded 23
polymorphic RAPD bands (Table 2) produced a unique fingerprint for
each of the 19 walnut genotypes included in this study (Table 1),
allowing an unequivocal identification of each genotype. Moreover, the
fingerprint of each genotype is defined by multiple RAPD bands,
presumably at multiple genetic loci. This is important for cultivar
characterization since each cultivar is defined not by a single marker
but by a set of several markers. This high level of polymorphism
probably reflects the outcrossing nature of walnut, since similar
results have been obtained with RAPDs in other outcrossing fruit and
nut tree species such as pistachio (Hormaza et al., 1994) or olive
(Fabbri et al., 1995).
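
Whether a set of markers gives every genotype a unique fingerprint can be checked mechanically by grouping identical presence/absence profiles. The sketch below assumes each genotype is scored as a sequence of 1/0 values over the same markers; the function name, genotype labels and profiles are hypothetical and are not the data of Table 2.

```python
def shared_fingerprints(profiles):
    """Return groups of genotype names whose band profiles are identical.

    An empty result means every genotype has a unique fingerprint.
    """
    groups = {}
    for name, bands in profiles.items():
        groups.setdefault(tuple(bands), []).append(name)
    return [names for names in groups.values() if len(names) > 1]


# Hypothetical scores for three genotypes over four markers.
distinct = {
    "G1": [1, 0, 1, 1],
    "G2": [1, 1, 0, 1],
    "G3": [0, 0, 1, 1],
}
# Dropping the first and second markers makes G1 and G3 indistinguishable.
reduced = {name: bands[2:] for name, bands in distinct.items()}

unique_check = shared_fingerprints(distinct)
reduced_check = shared_fingerprints(reduced)
```

The second call shows why a fingerprint based on several markers is more robust than one based on a single band: removing markers can collapse two genotypes into the same profile.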

As expected, most of the fragments amplified from DNA obtained from
the progenies were also present in the parents. Nevertheless, three
markers (K01-445, G16-700 and R15-490) were, in some cases, present in
the progeny and absent in both parents. Thus, K01-445 was present in
'Vina' and absent in 'Payne' and 'Franquette'; G16-700 was present in
'Sunland' and absent in 'PI-159568' and 'Lompoc'; R15-490 was present
in 'Serr' and absent in 'Lompoc' and 'PI-159568'. The occurrence of
non-parental bands has been reported in previous studies with RAPDs
(Hunt & Page, 1992; Riedy et al., 1992; Aruna et al., 1993; Ayliffe et
al., 1994; Pooler & Scorza, 1995) and different explanations have been
suggested, such as formation of heteroduplex molecules between
alternate RAPD alleles, mutations or recombination events within the
primer binding sites or inside the amplified fragments, competition
for primer binding sites or somatic rearrangements in perennial
plants.
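
Non-parental bands of the kind described above can be found with a simple set difference. In this sketch each genotype is represented by the set of markers scored as present. Only the K01-445 pattern ('Vina' versus 'Payne' and 'Franquette') is taken from the text; the other band names are placeholders invented for the example, and the function name is hypothetical.

```python
def non_parental_bands(progeny, parent_1, parent_2):
    """Bands present in the progeny but absent from both parents."""
    return sorted(set(progeny) - set(parent_1) - set(parent_2))


# K01-445 follows the text; band_1 to band_3 are made-up placeholders.
vina = {"K01-445", "band_1", "band_2"}
payne = {"band_1", "band_3"}
franquette = {"band_2", "band_3"}

unexpected = non_parental_bands(vina, payne, franquette)
print(unexpected)
```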

The similarity values based on 23 RAPDs (Table 3) ranged from 0.09 for
'Cisco' and 'Sharkey' to 0.89 for 'Lompoc' and 'Waterloo'. UPGMA
cluster analysis of the similarity matrix (Figure 3) separated the
walnut genotypes included in this study into two groups whose
differences were basically related to the original sources of the
genotypes used as parents in the breeding program (Figure 1). The
first group comprises 'Sharkey' and 'PI-159568', two genotypes
originating from Asia, and some of their progeny: '56-224',
'Chandler', 'Howard', 'Serr' and 'Sunland'. The second group contains
the Californian ('Marchetti', 'Payne', 'Waterloo') and French ('Conway
Mayette', 'Franquette' and 'Meylan') parental cultivars and most of
their progeny ('Cisco', 'Lompoc', 'Pedro', 'Tehama', 'Tulare' and
'Vina'). This is not unexpected since the three French parental
cultivars probably originated in the same French region (Isère). In
general, the cultivars sharing common parents tend to group together
and with at least one of the parents.

Among all the genotypes tested, 'Sharkey' appears to be the most
distantly related to all the others except its progeny ('56-224') and
'PI-159568', with similarity values of 0.69 and 0.61, respectively.
'PI-159568' itself shows a high similarity value only with its
progeny, with 'Sharkey' and with the progeny of 'Sharkey' ('56-224').
The markers G18-990, R19-755 and T06-400 are present only in
'PI-159568' and 'Sharkey' and some of their progeny, and the marker
R14-1130 is absent only in 'PI-159568' and 'Sharkey'. It is
interesting to note that those two genotypes are the only ones in this
study originating from Asia: 'PI-159568' from Afghanistan and
'Sharkey' probably from China (Tulecke & McGranahan, 1994).

Figure 3. Dendrogram of the 19 walnut genotypes studied generated by
UPGMA cluster analysis of the similarity values shown in Table 3.

The cultivar 'Payne', ancestor of most of the cultivars tested,
showed, as expected, a similarity value of 0.50 or more with all 10
cultivars related to it except 'Cisco' (similarity value of 0.47) and
'Sunland' (similarity value of 0.44), two cultivars that are two
generations away from 'Payne' in the pedigree.

Figure 4. Dendrogram of the parental cultivars generated by UPGMA
cluster analysis of the similarity values shown in Table 3.

In our research, as detected with other crop species (Aruna et al.,
1993; Dunemann et al., 1994; Dweikat et al., 1993; Hallden et al.,
1994), we observe a fairly close relationship between the known
pedigree and the genetic similarity obtained with RAPDs. This is of
great interest in breeding tree crop species since very often the
pedigree of the cultivars is unknown. However, it was not possible to
compare the genetic similarity values estimated with the Nei & Li
index with the coefficients of coancestry (Falconer, 1989) for three
main reasons. First, the available pedigree consists of a maximum of
just two generations. Second, the French and Californian cultivars are
probably related since the genetic base of the Californian cultivars
has been enriched with introductions from France. Third, although the
markers we have used allow us to unequivocally distinguish all the
cultivars studied, a higher number of markers will probably be needed
to obtain a dendrogram that accurately reflects the similarity matrix;
in our case the correlation coefficient between the cophenetic matrix
developed from the dendrogram and the similarity matrix was 0.65.
However, if only the eight initial parental genotypes are analyzed,
the dendrogram obtained (Figure 4) is a very good representation of
the similarity matrix since the correlation coefficient between the
two matrices is 0.86. Moreover, in this case, two clear clusters are
obtained: one with the genotypes of Asian origin, 'PI-159568' and
'Sharkey', and the other with the French and Californian cultivars.
This is the expected result based on the fact that the gene pool of
the Californian cultivars has been enriched with French germplasm.
Comparisons of genetic distances obtained with molecular markers and
theoretical data based on pedigree information have already been made
in different herbaceous species, and generally molecular marker-based
measures of genetic distance agree with pedigree information (Dudley,
1994). Since pedigree and passport data are often unknown or
incomplete for many fruit and nut tree species (Warburton & Bliss,
1996), RAPDs can be a useful tool to assess the degree of similarity
of accessions or cultivars in these woody species in order to select
the best parents to obtain new genetic combinations; this is
especially important if we consider the long generation times of most
fruit and nut tree species and, consequently, the length of the
breeding process.
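
The correlation between a cophenetic matrix and the similarity matrix, used above to judge how well a dendrogram represents the data, is an ordinary Pearson correlation taken over the off-diagonal pairs. The sketch below assumes both matrices are symmetric and already available; the function names and the two 4 x 4 matrices are hypothetical, with the cophenetic values chosen as the levels at which each pair would first be joined in a UPGMA tree of the toy similarity matrix.

```python
from math import sqrt


def upper_triangle(matrix):
    """Values above the diagonal, row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def matrix_correlation(matrix_1, matrix_2):
    """Pearson correlation between the off-diagonal entries of two matrices."""
    x, y = upper_triangle(matrix_1), upper_triangle(matrix_2)
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("matrices must have the same size, at least 3 x 3")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined for constant matrices")
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical similarity matrix for four genotypes.
toy_similarity = [
    [1.0, 0.9, 0.5, 0.3],
    [0.9, 1.0, 0.7, 0.1],
    [0.5, 0.7, 1.0, 0.4],
    [0.3, 0.1, 0.4, 1.0],
]
# Hypothetical cophenetic matrix: similarity level at which each pair joins.
c = 0.8 / 3
toy_cophenetic = [
    [1.0, 0.9, 0.6, c],
    [0.9, 1.0, 0.6, c],
    [0.6, 0.6, 1.0, c],
    [c, c, c, 1.0],
]
r = matrix_correlation(toy_similarity, toy_cophenetic)
print(f"cophenetic correlation: {r:.2f}")
```

A value close to 1 means the tree distorts the pairwise similarities very little; lower values, such as the 0.65 reported for all 19 genotypes, indicate that the dendrogram is a rougher summary of the matrix.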

The results obtained show, first, that the RAPD technique can detect
enough polymorphism to differentiate among walnut genotypes, even
among cultivars closely related because of their common parents (for
example 'Lompoc'-'Tehama' and 'Chandler'-'Howard'). Second, a general
pattern of separation between Californian-European and Asian genotypes
was obtained in this study, confirming the results previously reported
with RFLPs by Fjellstrom et al. (1994). Third, the RAPD method is a
relatively simple technique to study genetic relationships in walnut,
thus allowing the study of the influence of genetic drift and
selection, which cannot be predicted using pedigree information alone.
From the data obtained in this study we can conclude that RAPD
technology can be useful in current walnut breeding programs, allowing
the identification of new cultivars as well as the assessment of the
genetic similarity among different genotypes.
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